EVA-Bench: A New End-to-end Framework for Evaluating Voice Agents Paper • 2605.13841 • Published 20 days ago • 64
Do Enterprise Systems Need Learned World Models? The Importance of Context to Infer Dynamics Paper • 2605.12178 • Published 21 days ago • 61
CUA-Suite: Massive Human-annotated Video Demonstrations for Computer-Use Agents Paper • 2603.24440 • Published Mar 25 • 98