Cooper: Co-Optimizing Policy and Reward Models in Reinforcement Learning for Large Language Models Paper • 2508.05613 • Published Aug 7, 2025
Test-Time Reinforcement Learning for GUI Grounding via Region Consistency Paper • 2508.05615 • Published Aug 7, 2025
LAPO: Internalizing Reasoning Efficiency via Length-Adaptive Policy Optimization Paper • 2507.15758 • Published Jul 21, 2025
Hierarchical Budget Policy Optimization for Adaptive Reasoning Paper • 2507.15844 • Published Jul 21, 2025
Double-Checker: Enhancing Reasoning of Slow-Thinking LLMs via Self-Critical Fine-Tuning Paper • 2506.21285 • Published Jun 26, 2025
Teaching LLMs According to Their Aptitude: Adaptive Reasoning for Mathematical Problem Solving Paper • 2502.12022 • Published Feb 17, 2025
SVGenius: Benchmarking LLMs in SVG Understanding, Editing and Generation Paper • 2506.03139 • Published Jun 3, 2025
Do Large Language Models Excel in Complex Logical Reasoning with Formal Language? Paper • 2505.16998 • Published May 22, 2025
Triad: A Framework Leveraging a Multi-Role LLM-based Agent to Solve Knowledge Base Question Answering Paper • 2402.14320 • Published Feb 22, 2024
ViewSpatial-Bench: Evaluating Multi-perspective Spatial Localization in Vision-Language Models Paper • 2505.21500 • Published May 27, 2025
UGPhysics: A Comprehensive Benchmark for Undergraduate Physics Reasoning with Large Language Models Paper • 2502.00334 • Published Feb 1, 2025
AskToAct: Enhancing LLMs Tool Use via Self-Correcting Clarification Paper • 2503.01940 • Published Mar 3, 2025
Let LLMs Break Free from Overthinking via Self-Braking Tuning Paper • 2505.14604 • Published May 20, 2025
Mind the Gap: Bridging Thought Leap for Improved Chain-of-Thought Tuning Paper • 2505.14684 • Published May 20, 2025
VerifyBench: Benchmarking Reference-based Reward Systems for Large Language Models Paper • 2505.15801 • Published May 21, 2025
MathFimer: Enhancing Mathematical Reasoning by Expanding Reasoning Steps through Fill-in-the-Middle Task Paper • 2502.11684 • Published Feb 17, 2025
S$^3$c-Math: Spontaneous Step-level Self-correction Makes Large Language Models Better Mathematical Reasoners Paper • 2409.01524 • Published Sep 3, 2024
LogicPro: Improving Complex Logical Reasoning via Program-Guided Learning Paper • 2409.12929 • Published Sep 19, 2024