The official datasets and model checkpoints of ARPO
KABI
dongguanting
AI & ML interests
Reasoning and Alignment for Large Language Models
Recent Activity
upvoted a paper about 21 hours ago
UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning
upvoted a paper about 21 hours ago
R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Annealing and Reinforce Learning
upvoted a paper about 21 hours ago
A Survey of Scientific Large Language Models: From Data Foundations to Agent Frontiers