Best of Both Worlds: Advantages of Hybrid Graph Sequence Models Paper • 2411.15671 • Published 17 days ago • 7
All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages Paper • 2411.16508 • Published 16 days ago • 7
Knowledge Transfer Across Modalities with Natural Language Supervision Paper • 2411.15611 • Published 18 days ago • 15
O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson? Paper • 2411.16489 • Published 16 days ago • 36
From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge Paper • 2411.16594 • Published 16 days ago • 36
Unleashing Reasoning Capability of LLMs via Scalable Question Synthesis from Scratch Paper • 2410.18693 • Published Oct 24 • 40
Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss Paper • 2410.17243 • Published Oct 22 • 88
On Memorization of Large Language Models in Logical Reasoning Paper • 2410.23123 • Published Oct 30 • 17
A Large Recurrent Action Model: xLSTM enables Fast Inference for Robotics Tasks Paper • 2410.22391 • Published Oct 29 • 21
Llama 3.1 Collection This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated 5 days ago • 634
Llama 3.2 Collection This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated 5 days ago • 540