MaLA-LM Collection MaLA-LM: Massive Language Adaptation of Large Language Models • 7 items • Updated Oct 7 • 1
AIMv2 Collection A collection of AIMv2 vision encoders that supports a number of resolutions, native resolution, and a distilled checkpoint. • 19 items • Updated 2 days ago • 45
LLM2CLIP Collection LLM2CLIP makes SOTA pretrained CLIP models more SOTA than ever. • 7 items • Updated 5 days ago • 37
Article Releasing the largest multilingual open pretraining dataset By Pclanglais • 11 days ago • 94
GLiClass Collection Generalist and Lightweight Models for Zero-shot Text Classification • 13 items • Updated Sep 17 • 11
Molmo Collection Artifacts for open multimodal language models. • 5 items • Updated 10 days ago • 273
Llama 3.2 Collection This collection hosts the transformers and original repos of Llama 3.2 and Llama Guard 3 • 15 items • Updated about 1 month ago • 492
Aya Datasets Collection The Aya Collection is a massive multilingual collection for over 100 languages consisting of 513 million instances of prompts and completions. • 5 items • Updated Jun 28 • 13
The first neural machine translation system for the Erzya language Paper • 2209.09368 • Published Sep 19, 2022 • 1
Seamless: Multilingual Expressive and Streaming Speech Translation Paper • 2312.05187 • Published Dec 8, 2023 • 13
WebInstruct 🌐 Embeddings 🧱 Models Collection A collection of SoTA embedding models fine-tuned on the WebInstruct dataset to learn to pair instructions with their responses • 3 items • Updated Sep 4 • 11
Teaching Llama a New Language Through Cross-Lingual Knowledge Transfer Paper • 2404.04042 • Published Apr 5 • 2
xGen-MM (BLIP-3): A Family of Open Large Multimodal Models Paper • 2408.08872 • Published Aug 16 • 97