BLOOM: A 176B-Parameter Open-Access Multilingual Language Model Paper • 2211.05100 • Published Nov 9, 2022 • 37
DistilBERT release Collection Original DistilBERT model, with checkpoints obtained via teacher-student learning from the original BERT checkpoints. • 6 items • Updated Apr 17, 2024 • 38
Article Accelerate ND-Parallel: A guide to Efficient Multi-GPU Training • Aug 8, 2025 • 93
Latent Adversarial Regularization for Offline Preference Optimization Paper • 2601.22083 • Published Jan 29 • 13
SWE-RL Collection Solving GitHub issues with the Agentless scaffold and RL • 3 items • Updated Aug 11, 2025 • 1
MMTEB: Massive Multilingual Text Embedding Benchmark Paper • 2502.13595 • Published Feb 19, 2025 • 45
Domain specific data and model documentation Collection A growing number of datasheet and model card frameworks are being proposed for particular domains. This collection tries to capture some of them. • 6 items • Updated Oct 5, 2023 • 3
BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution Paper • 2510.08697 • Published Oct 9, 2025 • 39
MMMU-Pro: A More Robust Multi-discipline Multimodal Understanding Benchmark Paper • 2409.02813 • Published Sep 4, 2024 • 33
MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI Paper • 2404.16006 • Published Apr 24, 2024 • 2