MobileLLM-R1 Collection MobileLLM-R1, a series of sub-billion parameter reasoning models • 6 items • Updated 7 days ago • 18
Encoders vs Decoders: the Ettin Suite Collection A collection of SOTA, open-data, paired encoder-only and decoder-only models ranging from 17M params to 1B. See the paper at https://arxiv.org/abs/250 • 32 items • Updated Jul 16 • 19
GLiNER-X Collection A multilingual Named Entity Recognition (NER) model capable of identifying any entity type. • 6 items • Updated Jun 24 • 21
Article Transformers backend integration in SGLang By marcsun13 and 4 others • Jun 23 • 53
Falcon-H1 Collection Falcon-H1 Family of Hybrid-Head Language Models (Transformer-SSM), including 0.5B, 1.5B, 1.5B-Deep, 3B, 7B, and 34B (pretrained & instruction-tuned). • 38 items • Updated 8 days ago • 54
Article 🥬 LettuceDetect Goes Multilingual: Fine-tuning EuroBERT on Synthetic Translations By adaamko and 1 other • May 19 • 9
Article You could have designed state of the art positional encoding By FL33TW00D-HF • Nov 25, 2024 • 366
Deepseek Papers Collection Deepseek papers collection • 24 items • Updated about 8 hours ago • 273
Gemma 3 QAT Collection Quantization Aware Trained (QAT) Gemma 3 checkpoints. The models preserve quality similar to half precision while using 3x less memory • 15 items • Updated Jul 10 • 209
Orpheus Multilingual Research Release Collection Beta Release of multilingual models. • 12 items • Updated Apr 10 • 100
Kimi-VL-A3B Collection Moonshot's efficient MoE VLMs, exceptional on agent, long-context, and thinking • 7 items • Updated Jul 1 • 76
Nomic Embed Multimodal Collection Multimodal models allowing you to search over interleaved text, PDFs, charts, and images! • 16 items • Updated Jun 3 • 24
Ultravox v0.5 Collection Ultravox is a multimodal Speech LLM built around different pretrained LLMs (frozen) and the whisper-large-v3-turbo (fine-tuned) backbone. • 4 items • Updated 11 days ago • 19