SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published 1 day ago • 58
Learning to Generate Unit Tests for Automated Debugging Paper • 2502.01619 • Published 3 days ago • 3
Post: 🚀 Introducing @huggingface Open Deep-Research 💥 In just 24 hours, we built an open-source agent that:
✅ Autonomously browses the web
✅ Searches, scrolls & extracts info
✅ Downloads & manipulates files
✅ Runs calculations on data
55% on the GAIA validation set! Help us improve it! 💡 https://huggingface.co/blog/open-deep-research
MMVU: Measuring Expert-Level Multi-Discipline Video Understanding Paper • 2501.12380 • Published 16 days ago • 81
Towards Best Practices for Open Datasets for LLM Training Paper • 2501.08365 • Published 23 days ago • 53
AfriHate: A Multilingual Collection of Hate Speech and Abusive Language Datasets for African Languages Paper • 2501.08284 • Published 23 days ago • 6
ChemAgent: Self-updating Library in Large Language Models Improves Chemical Reasoning Paper • 2501.06590 • Published 26 days ago • 9
The GAN is dead; long live the GAN! A Modern GAN Baseline Paper • 2501.05441 • Published 28 days ago • 87
Post: Discover all the improvements in the new version of Lighteval: https://huggingface.co/docs/lighteval/
Multitask Prompted Training Enables Zero-Shot Task Generalization Paper • 2110.08207 • Published Oct 15, 2021 • 2
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model Paper • 2211.05100 • Published Nov 9, 2022 • 28
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference Paper • 2412.13663 • Published Dec 18, 2024 • 126
If You Can't Use Them, Recycle Them: Optimizing Merging at Scale Mitigates Performance Tradeoffs Paper • 2412.04144 • Published Dec 5, 2024 • 4