Enabling Scalable Oversight via Self-Evolving Critic Paper • 2501.05727 • Published 6 days ago • 62
The Lessons of Developing Process Reward Models in Mathematical Reasoning Paper • 2501.07301 • Published 3 days ago • 67
Qwen2.5-Math Collection Math-specific model series based on Qwen2.5 • 11 items • Updated 2 days ago • 62
Qwen2-Math Collection Math-specific model series based on Qwen2 • 8 items • Updated Nov 28, 2024 • 47
DotaMath: Decomposition of Thought with Code Assistance and Self-correction for Mathematical Reasoning Paper • 2407.04078 • Published Jul 4, 2024 • 18
Self-play with Execution Feedback: Improving Instruction-following Capabilities of Large Language Models Paper • 2406.13542 • Published Jun 19, 2024 • 16