mmBERT: a modern multilingual encoder Collection mmBERT is trained on 3T tokens from over 1800 languages, showing SoTA scores on benchmarks and exceptional low-resource performance • 16 items • Updated Sep 9, 2025 • 52
LightOnOCR: A 1B End-to-End Multilingual Vision-Language Model for State-of-the-Art OCR Paper • 2601.14251 • Published Jan 20 • 25
pplx-embed Collection Diffusion-Pretrained Dense and Contextual Embeddings • 7 items • Updated 3 days ago • 74
OneVision-Encoder: Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence Paper • 2602.08683 • Published 20 days ago • 49
Article Building the Open Agent Ecosystem Together: Introducing OpenEnv +8 Oct 23, 2025 • 149
Article GGML and llama.cpp join HF to ensure the long-term progress of Local AI +4 10 days ago • 469
OpenVision 3: A Family of Unified Visual Encoder for Both Understanding and Generation Paper • 2601.15369 • Published Jan 21 • 21
OpenVision 3 Collection A Family of Unified Visual Encoder with Unified Visual Representation. • 4 items • Updated Jan 27 • 2
Article Community Evals: Because we're done trusting black-box leaderboards over the community +5 26 days ago • 82
Article We’re open-sourcing our text-to-image model and the process behind it Nov 12, 2025 • 85
Article Nemotron ColEmbed V2: Raising the Bar for Multimodal Retrieval with ViDoRe V3’s Top Model 25 days ago • 28