DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search Paper • 2509.25454 • Published Sep 29 • 136
view article Article Kimina-Prover: Applying Test-time RL Search on Large Formal Reasoning Models By AI-MO and 18 others • Jul 10 • 53
SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics Paper • 2506.01844 • Published Jun 2 • 141
view article Article SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data Jun 3 • 278
Taming LLMs by Scaling Learning Rates with Gradient Grouping Paper • 2506.01049 • Published Jun 1 • 38
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning Paper • 2506.01939 • Published Jun 2 • 185
AceReason-Nemotron: Advancing Math and Code Reasoning through Reinforcement Learning Paper • 2505.16400 • Published May 22 • 35
Absolute Zero: Reinforced Self-play Reasoning with Zero Data Paper • 2505.03335 • Published May 6 • 185
CheXWorld: Exploring Image World Modeling for Radiograph Representation Learning Paper • 2504.13820 • Published Apr 18 • 16
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model? Paper • 2504.13837 • Published Apr 18 • 135
DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution Paper • 2411.02359 • Published Nov 4, 2024 • 13
How Far is Video Generation from World Model: A Physical Law Perspective Paper • 2411.02385 • Published Nov 4, 2024 • 34
Model Surgery: Modulating LLM's Behavior Via Simple Parameter Editing Paper • 2407.08770 • Published Jul 11, 2024 • 21