VideoSeeker: Incentivizing Instance-level Video Understanding via Native Agentic Tool Invocation • Paper • 2605.16079 • Published 9 days ago • 27
RubricEM: Meta-RL with Rubric-guided Policy Decomposition beyond Verifiable Rewards • Paper • 2605.10899 • Published 13 days ago • 74
Flow-OPD: On-Policy Distillation for Flow Matching Models • Paper • 2605.08063 • Published 16 days ago • 97
MolmoAct2: Action Reasoning Models for Real-world Deployment • Paper • 2605.02881 • Published 20 days ago • 335
All Roads Lead to Rome: Incentivizing Divergent Thinking in Vision-Language Models • Paper • 2604.00479 • Published Apr 1 • 68