mmBERT: a modern multilingual encoder
mmBERT is trained on 3T tokens from over 1800 languages, achieving SoTA benchmark scores and exceptionally strong low-resource performance.
jhu-clsp/mmBERT-base • Fill-Mask • Updated 6 days ago • 68k • 143
jhu-clsp/mmBERT-small • Fill-Mask • Updated 6 days ago • 8.89k • 49
jhu-clsp/mmBERT-checkpoints • Updated Sep 9 • 2
jhu-clsp/mmBERT-pretrain-p1-fineweb2-langs • Updated 29 days ago • 4.48k • 4
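Since the mmBERT checkpoints are tagged Fill-Mask, a quick way to try one is the standard transformers fill-mask pipeline. Below is a minimal sketch, not an official snippet from the collection: the prompt and top_k value are illustrative, and the mask token is read from the tokenizer rather than hardcoded, since different encoders use different mask strings.

```python
# Minimal sketch: masked-token prediction with mmBERT-base via the
# standard transformers fill-mask pipeline. The prompt is illustrative,
# not taken from the model card.
from transformers import pipeline

fill = pipeline("fill-mask", model="jhu-clsp/mmBERT-base")

# Read the mask token from the tokenizer instead of assuming "[MASK]".
mask = fill.tokenizer.mask_token
for pred in fill(f"The capital of France is {mask}.", top_k=3):
    print(f"{pred['token_str']!r}: {pred['score']:.3f}")
```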
Encoders vs Decoders: the Ettin Suite
A collection of SoTA, open-data, paired encoder-only and decoder-only models ranging from 17M to 1B parameters. See the paper at https://arxiv.org/abs/2507.11412.
Seq vs Seq: An Open Suite of Paired Encoders and Decoders • Paper • 2507.11412 • Published Jul 15 • 28
jhu-clsp/ettin-encoder-17m • Fill-Mask • Updated Jul 16 • 4.3k • 8
jhu-clsp/ettin-encoder-32m • Feature Extraction • Updated Jul 18 • 3.5k • 5
jhu-clsp/ettin-encoder-68m • Fill-Mask • Updated Jul 18 • 436 • 3
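One of the Ettin encoder cards is tagged Feature Extraction. A conventional recipe for turning an encoder's token states into sentence embeddings is mean pooling over non-padding tokens; the sketch below assumes that recipe (the pooling choice and example sentences are assumptions, not specified by the collection).

```python
# Minimal sketch: sentence embeddings from an Ettin encoder via
# mean pooling over non-padding tokens. The pooling strategy is an
# assumption; the collection only tags the model as Feature Extraction.
import torch
from transformers import AutoModel, AutoTokenizer

name = "jhu-clsp/ettin-encoder-17m"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name).eval()

texts = ["Encoders embed text.", "Decoders generate text."]
batch = tokenizer(texts, padding=True, return_tensors="pt")
with torch.no_grad():
    hidden = model(**batch).last_hidden_state  # (batch, seq, dim)

# Average only over real tokens, masking out padding positions.
mask = batch["attention_mask"].unsqueeze(-1)
embeddings = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
print(embeddings.shape)  # (2, hidden_dim)
```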