Pushing on Multilingual Reasoning Models with Language-Mixed Chain-of-Thought Paper • 2510.04230 • Published Oct 5 • 26
Large Reasoning Models Learn Better Alignment from Flawed Thinking Paper • 2510.00938 • Published Oct 1 • 57
When AI Co-Scientists Fail: SPOT-a Benchmark for Automated Verification of Scientific Research Paper • 2505.11855 • Published May 17 • 10
Linguistic Generalizability of Test-Time Scaling in Mathematical Reasoning Paper • 2502.17407 • Published Feb 24 • 26
Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning Paper • 2502.14768 • Published Feb 20 • 47
The Lessons of Developing Process Reward Models in Mathematical Reasoning Paper • 2501.07301 • Published Jan 13 • 99
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published Jan 8 • 285
Article Navigating Korean LLM Research #2: Evaluation Tools By amphora • Oct 23, 2024 • 8
Training Language Models to Self-Correct via Reinforcement Learning Paper • 2409.12917 • Published Sep 19, 2024 • 140
Article Llama-3.1-Storm-8B: Improved SLM with Self-Curation + Model Merging By akjindal53244 • Aug 19, 2024 • 78
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery Paper • 2408.06292 • Published Aug 12, 2024 • 126