DeepWideSearch: Benchmarking Depth and Width in Agentic Information Seeking Paper • 2510.20168 • Published 14 days ago • 26
HSCodeComp: A Realistic and Expert-level Benchmark for Deep Search Agents in Hierarchical Rule Application Paper • 2510.19631 • Published 14 days ago • 26
DeepAnalyze: Agentic Large Language Models for Autonomous Data Science Paper • 2510.16872 • Published 17 days ago • 91
Watch and Learn: Learning to Use Computers from Online Videos Paper • 2510.04673 • Published about 1 month ago • 10
UniMoE-Audio: Unified Speech and Music Generation with Dynamic-Capacity MoE Paper • 2510.13344 • Published 22 days ago • 61
ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform Data Paper • 2509.15221 • Published Sep 18 • 109
UI-S1: Advancing GUI Automation via Semi-online Reinforcement Learning Paper • 2509.11543 • Published Sep 15 • 47
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey Paper • 2509.02547 • Published Sep 2 • 220
SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning Paper • 2509.02479 • Published Sep 2 • 83
UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning Paper • 2509.02544 • Published Sep 2 • 123