ARPO
The official datasets and model checkpoints of ARPO
Agentic Reinforced Policy Optimization • Paper • 2507.19849 • Published Jul 26 • 147
dongguanting/Qwen3-8B-ARPO-DeepSearch • 8B • Updated Jul 29 • 31 • 1
dongguanting/Qwen3-14B-ARPO-DeepSearch • Text Generation • 15B • Updated 24 days ago • 63 • 4
dongguanting/Qwen2.5-7B-ARPO • Text Generation • 8B • Updated 17 days ago • 72 • 2
RAG-Critic
dongguanting/RAG-Critic-3B • Text Generation • 3B • Updated Jun 28 • 20 • 3
dongguanting/RAG-Error-Critic-100K • Viewer • Updated Jun 28 • 100k • 19 • 2
Tool-Star
Tool-Star is a reinforcement learning-based framework designed to empower LLMs to autonomously invoke multiple external tools during stepwise reasoning
Tool-Star: Empowering LLM-Brained Multi-Tool Reasoner via Reinforcement Learning • Paper • 2505.16410 • Published May 22 • 57
dongguanting/Tool-Star-SFT-54K • Viewer • Updated May 29 • 54k • 268 • 8
dongguanting/Multi-Tool-RL-10K • Viewer • Updated May 25 • 10k • 127 • 4
dongguanting/Tool-Star-Qwen-7B • Text Generation • 8B • Updated Jun 30 • 20 • 2