MexIvanov (Mex Ivanov)

liked 2 models 3 days ago

MiniMaxAI/MiniMax-M2

Text Generation • 229B • Updated about 5 hours ago • 846k • 1.15k

unsloth/MiniMax-M2-GGUF

Text Generation • 229B • Updated 4 days ago • 36.3k • 38

liked 2 models 10 days ago

AvitoTech/avision

Image-Text-to-Text • 7B • Updated 15 days ago • 355 • 20

AvitoTech/avibe

Text Generation • 8B • Updated 8 days ago • 6.88k • 39

reacted to tomaarsen's post with 🤗🚀 16 days ago

Post

3484

🤗 Sentence Transformers is joining Hugging Face! 🤗 This formalizes the existing maintenance structure, as I've personally led the project for the past two years on behalf of Hugging Face! Details:

Today, the Ubiquitous Knowledge Processing (UKP) Lab is transferring the project to Hugging Face. Sentence Transformers will remain a community-driven, open-source project, with the same open-source license (Apache 2.0) as before. Contributions from researchers, developers, and enthusiasts are welcome and encouraged. The project will continue to prioritize transparency, collaboration, and broad accessibility.

Read our full announcement for more details and quotes from UKP and Hugging Face leadership: https://huggingface.co/blog/sentence-transformers-joins-hf

We see an increasing wish from companies to move from large LLM APIs to local models for better control and privacy, reflected in the library's growth: in just the last 30 days, Sentence Transformer models have been downloaded >270 million times, second only to transformers.

I would like to thank the UKP Lab, and especially Nils Reimers and Iryna Gurevych, both for their dedication to the project and for their trust in me, both now and two years ago. Back then, neither of you knew me well, yet you trusted me to take the project to new heights. That choice ended up being very valuable for the embedding & Information Retrieval community, and I think this choice of granting Hugging Face stewardship will be similarly successful.

I'm very excited about the future of the project, and for the world of embeddings and retrieval at large!

liked a model 29 days ago

neuphonic/neutts-air

Text-to-Speech • 0.7B • Updated 28 days ago • 42.4k • 731

liked a Space 29 days ago

255

NeuTTS-Air

Generate speech from text using a reference audio sample

liked a model 2 months ago

google/embeddinggemma-300m

liked a dataset 6 months ago

nyuuzyou/clker-svg

Viewer • Updated May 1 • 256k • 36 • 4

liked a model 7 months ago

microsoft/bitnet-b1.58-2B-4T

Text Generation • 0.8B • Updated May 1 • 10.9k • 1.22k

reacted to takarajordan's post with 👍 8 months ago

Post

1881

Takara takes 3rd place in the {tech:munich} AI hackathon with Fudeno!

A little over 2 weeks ago, @aldigobbler and I set out to create the largest multimodal SVG dataset ever created. We succeeded, and when I was in Munich, Germany, I took it one step further and made an entire app with it!

We fine-tuned Mistral Small, made a Next.js application and blew some minds, taking 3rd place out of over 100 hackers. So cool!

If you want to see the dataset, please see below.

takara-ai/fudeno-instruct-4M

liked a model 8 months ago

manycore-research/SpatialLM-Llama-1B

Text Generation • 1B • Updated Mar 21 • 309 • 986

reacted to merve's post with 🤗 8 months ago

Post

4362

So many open releases on Hugging Face this past week 🤯 recapping them all here ⤵️ merve/march-21-releases-67dbe10e185f199e656140ae

👀 Multimodal
> Mistral AI released a 24B vision LM, both base and instruction FT versions, sota 🔥 (OS)
> with IBM we released SmolDocling, a sota 256M document parser with Apache 2.0 license (OS)
> SpatialLM is a new vision LM that outputs 3D bounding boxes, comes with 0.5B (QwenVL based) and 1B (Llama based) variants
> SkyWork released SkyWork-R1V-38B, new vision reasoning model (OS)

💬 LLMs
> NVIDIA released new Nemotron models in 49B and 8B with their post-training dataset
> LG released EXAONE, new reasoning models in 2.4B, 7.8B and 32B
> Dataset: Glaive AI released a new reasoning dataset of 22M+ examples
> Dataset: NVIDIA released new helpfulness dataset HelpSteer3
> Dataset: OpenManusRL is a new agent dataset based on ReAct framework (OS)
> Open-R1 team released OlympicCoder, new competitive coder model in 7B and 32B
> Dataset: GeneralThought-430K is a new reasoning dataset (OS)

🖼️ Image Generation/Computer Vision
> Roboflow released RF-DETR, new real-time sota object detector (OS) 🔥
> YOLOE is a new real-time zero-shot object detector with text and visual prompts 🥹
> Stability AI released Stable Virtual Camera, a new novel view synthesis model
> Tencent released Hunyuan3D-2mini, new small and fast 3D asset generation model
> ByteDance released InfiniteYou, new realistic photo generation model
> StarVector is a new 8B model that generates svg from images
> FlexWorld is a new model that expands 3D views (OS)

🎤 Audio
> Sesame released CSM-1B, new speech generation model (OS)

🤖 Robotics
> NVIDIA released GR00T, new robotics model for generalized reasoning and skills, along with the dataset

*OS ones have Apache 2.0 or MIT license

liked a model 8 months ago

docling-project/SmolDocling-256M-preview

Image-Text-to-Text • 0.3B • Updated Sep 17 • 361k • 1.59k

reacted to jasoncorkill's post with 👍 8 months ago

Post

3829

At Rapidata, we compared DeepL with LLMs like DeepSeek-R1, Llama, and Mixtral for translation quality using feedback from over 51,000 native speakers. Despite its cost, DeepL's performance makes it a valuable investment, especially in critical applications where translation quality is paramount. Now we can say that Europe does more than just impose regulations.

Our dataset, based on these comparisons, is now available on Hugging Face. This might be useful for anyone working on AI translation or language model evaluation.

Rapidata/Translation-deepseek-llama-mixtral-v-deepl

liked a model 8 months ago

yandex/YandexGPT-5-Lite-8B-pretrain

8B • Updated Mar 31 • 898 • 212

reacted to tomaarsen's post with ❤️ 8 months ago

Post

6904

An assembly of 18 European companies, labs, and universities have banded together to launch 🇪🇺 EuroBERT! It's a state-of-the-art multilingual encoder for 15 European languages, designed to be finetuned for retrieval, classification, etc.

🇪🇺 15 Languages: English, French, German, Spanish, Chinese, Italian, Russian, Polish, Portuguese, Japanese, Vietnamese, Dutch, Arabic, Turkish, Hindi
3️⃣ 3 model sizes: 210M, 610M, and 2.1B parameters - very very useful sizes in my opinion
➡️ Sequence length of 8192 tokens! Nice to see these higher sequence lengths for encoders becoming more common.
⚙️ Architecture based on Llama, but with bi-directional (non-causal) attention to turn it into an encoder. Flash Attention 2 is supported.
🔥 A new Pareto frontier (stronger *and* smaller) for multilingual encoder models
📊 Evaluated against mDeBERTa, mGTE, XLM-RoBERTa for Retrieval, Classification, and Regression (after finetuning for each task separately): EuroBERT punches way above its weight.
📝 Detailed paper covering everything, incl. data: FineWeb for English and CulturaX for multilingual data, The Stack v2 and Proof-Pile-2 for code.
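
As a rough illustration of the bi-directional attention point above: a causal decoder lets each token attend only to itself and earlier positions, while an encoder lets every token attend to all positions. The sketch below is plain Python with hypothetical helper names (`causal_mask`, `bidirectional_mask`, `visible_positions`); it only illustrates the masking idea and is not EuroBERT's actual implementation.

```python
# Minimal sketch of causal vs. bi-directional attention masks.
# All function names here are hypothetical, chosen for illustration only.

def causal_mask(n: int) -> list[list[bool]]:
    """mask[i][j] is True when position i may attend to position j.

    Decoder-style: each position sees itself and earlier positions only.
    """
    return [[j <= i for j in range(n)] for i in range(n)]


def bidirectional_mask(n: int) -> list[list[bool]]:
    """Encoder-style: every position may attend to every other position."""
    return [[True] * n for _ in range(n)]


def visible_positions(mask: list[list[bool]], i: int) -> list[int]:
    """Return the indices that position i is allowed to attend to."""
    return [j for j, allowed in enumerate(mask[i]) if allowed]


if __name__ == "__main__":
    n = 4
    print("causal, token 1 sees:", visible_positions(causal_mask(n), 1))
    print("bi-directional, token 1 sees:", visible_positions(bidirectional_mask(n), 1))
```

With the causal mask, token 1 in a 4-token sequence can only use tokens 0 and 1; with the bi-directional mask it can use all four, which is what lets an encoder build each token's representation from both left and right context.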

Check out the release blogpost here: https://huggingface.co/blog/EuroBERT/release
* EuroBERT/EuroBERT-210m
* EuroBERT/EuroBERT-610m
* EuroBERT/EuroBERT-2.1B

The next step is for researchers to build upon the 3 EuroBERT base models and publish strong retrieval, zero-shot classification, etc. models for all to use. I'm very much looking forward to it!

liked a dataset 8 months ago

TuringsSolutions/MemoryVaccine120

Viewer • Updated Feb 28 • 121 • 31 • 2

liked a model 8 months ago

coqui/XTTS-v2

Text-to-Speech • Updated Dec 11, 2023 • 5.3M • 3.15k

Mex Ivanov

AI & ML interests

Recent Activity

Organizations

MexIvanov's activity
