Hymba: A Hybrid-head Architecture for Small Language Models • Paper • 2411.13676 • Published 6 days ago • 35
RedPajama: an Open Dataset for Training Large Language Models • Paper • 2411.12372 • Published 8 days ago • 47
EoRA: Training-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation • Paper • 2410.21271 • Published 29 days ago • 6
Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss • Paper • 2410.17243 • Published Oct 22 • 88
PHI-S: Distribution Balancing for Label-Free Multi-Teacher Distillation • Paper • 2410.01680 • Published Oct 2 • 32
Addition is All You Need for Energy-efficient Language Models • Paper • 2410.00907 • Published Oct 1 • 144
MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning • Paper • 2409.20566 • Published Sep 30 • 52
AI Paper of the Day • Collection • A collection of papers that I think are interesting, one added each day • 229 items • Updated about 3 hours ago • 28
Training Language Models to Self-Correct via Reinforcement Learning • Paper • 2409.12917 • Published Sep 19 • 135