S$^3$c-Math: Spontaneous Step-level Self-correction Makes Large Language Models Better Mathematical Reasoners Paper • 2409.01524 • Published Sep 3, 2024
VerifyBench: Benchmarking Reference-based Reward Systems for Large Language Models Paper • 2505.15801 • Published May 21, 2025
MathFimer: Enhancing Mathematical Reasoning by Expanding Reasoning Steps through Fill-in-the-Middle Task Paper • 2502.11684 • Published Feb 17, 2025
Safe: Enhancing Mathematical Reasoning in Large Language Models via Retrospective Step-aware Formal Verification Paper • 2506.04592 • Published Jun 5, 2025
SALT4Decompile: Inferring Source-level Abstract Logic Tree for LLM-Based Binary Decompilation Paper • 2509.14646 • Published Sep 18, 2025
Thinking-Free Policy Initialization Makes Distilled Reasoning Models More Effective and Efficient Reasoners Paper • 2509.26226 • Published Sep 30, 2025
On Predictability of Reinforcement Learning Dynamics for Large Language Models Paper • 2510.00553 • Published Oct 1, 2025
Can We Verify Step by Step for Incorrect Answer Detection? Paper • 2402.10528 • Published Feb 16, 2024
UGPhysics: A Comprehensive Benchmark for Undergraduate Physics Reasoning with Large Language Models Paper • 2502.00334 • Published Feb 1, 2025
UGMathBench: A Diverse and Dynamic Benchmark for Undergraduate-Level Mathematical Reasoning with Large Language Models Paper • 2501.13766 • Published Jan 23, 2025
Advancing Multimodal Reasoning Capabilities of Multimodal Large Language Models via Visual Perception Reward Paper • 2506.07218 • Published Jun 8, 2025
GPAS: Accelerating Convergence of LLM Pretraining via Gradient-Preserving Activation Scaling Paper • 2506.22049 • Published Jun 27, 2025
Double-Checker: Enhancing Reasoning of Slow-Thinking LLMs via Self-Critical Fine-Tuning Paper • 2506.21285 • Published Jun 26, 2025
Teaching LLMs According to Their Aptitude: Adaptive Reasoning for Mathematical Problem Solving Paper • 2502.12022 • Published Feb 17, 2025