Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models Paper • 2312.06585 • Published Dec 11, 2023 • 28
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models Paper • 2402.03300 • Published Feb 5 • 70
A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts Paper • 2402.09727 • Published Feb 15 • 36
Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping Paper • 2402.14083 • Published Feb 21 • 47
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context Paper • 2403.05530 • Published Mar 8 • 60
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone Paper • 2404.14219 • Published Apr 22 • 254
How Abilities in Large Language Models are Affected by Supervised Fine-tuning Data Composition Paper • 2310.05492 • Published Oct 9, 2023 • 2
DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines Paper • 2310.03714 • Published Oct 5, 2023 • 30
Training Language Models on the Knowledge Graph: Insights on Hallucinations and Their Detectability Paper • 2408.07852 • Published Aug 14 • 15
Sycophancy to Subterfuge: Investigating Reward-Tampering in Large Language Models Paper • 2406.10162 • Published Jun 14
RULER: What's the Real Context Size of Your Long-Context Language Models? Paper • 2404.06654 • Published Apr 9 • 34
Training language models to follow instructions with human feedback Paper • 2203.02155 • Published Mar 4, 2022 • 16
Adding Error Bars to Evals: A Statistical Approach to Language Model Evaluations Paper • 2411.00640 • Published 24 days ago • 3