LMEnt: A Suite for Analyzing Knowledge in Language Models from Pretraining Data to Representations Paper • 2509.03405 • Published 19 days ago • 23
sentence-transformers/embeddinggemma-300m-medical Sentence Similarity • 0.3B • Updated 18 days ago • 1.51k • • 17
mmBERT: a modern multilingual encoder Collection mmBERT is trained on 3T tokens from over 1800 languages, showing SoTA scores on benchmarks and exceptional low-resource performance • 16 items • Updated 13 days ago • 37
GLiNER-decoder Collection A joint encoder-decoder GLiNER model for a scalable open-ontology entity recognition • 3 items • Updated Aug 15 • 16