Great Models Think Alike and this Undermines AI Oversight • Paper • 2502.04313 • Published 10 days ago • 28
NeMo Curator - Classifier Models • Collection • Classifier models that can be used in NeMo Curator for labelling/filtering datasets. • 11 items • Updated 1 day ago • 15
SmolVLM 256M & 500M • Collection • Collection for models & demos for even smoller SmolVLM release • 12 items • Updated 24 days ago • 68
An Empirical Study of Autoregressive Pre-training from Videos • Paper • 2501.05453 • Published Jan 9 • 37
DolphinLabeled Datasets • Collection • Eric Hartford has added labels to help you filter datasets, for your pleasure. • 5 items • Updated Jan 6 • 12
Explanatory Instructions: Towards Unified Vision Tasks Understanding and Zero-shot Generalization • Paper • 2412.18525 • Published Dec 24, 2024 • 75
Embedding Model Datasets • Collection • A curated subset of the datasets that work out of the box with Sentence Transformers: https://huggingface.co/datasets?other=sentence-transformers • 68 items • Updated 2 days ago • 104
LongBench v2: Towards Deeper Understanding and Reasoning on Realistic Long-context Multitasks • Paper • 2412.15204 • Published Dec 19, 2024 • 33
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference • Paper • 2412.13663 • Published Dec 18, 2024 • 132
ModernBERT • Collection • Bringing BERT into modernity via both architecture changes and scaling • 3 items • Updated Dec 19, 2024 • 134
Byte Latent Transformer: Patches Scale Better Than Tokens • Paper • 2412.09871 • Published Dec 13, 2024 • 92
From Words to Numbers: Your Large Language Model Is Secretly A Capable Regressor When Given In-Context Examples • Paper • 2404.07544 • Published Apr 11, 2024 • 20