SoCRATES: Towards Reliable Automated Evaluation of Proactive LLM Mediation across Domains and Socio-cognitive Variations Paper • 2606.05563 • Published 6 days ago • 45
SoCRATES: Towards Reliable Automated Evaluation of Proactive LLM Mediation across Domains and Socio-cognitive Variations Paper • 2606.05563 • Published 6 days ago • 45
Thinking with Imagination: Agentic Visual Spatial Reasoning with World Simulators Paper • 2606.06476 • Published 6 days ago • 15
Watch, Remember, Reason: Human-View Video Understanding with MLLMs Paper • 2606.07433 • Published 5 days ago • 20
π-Bench: Evaluating Proactive Personal Assistant Agents in Long-Horizon Workflows Paper • 2605.14678 • Published 22 days ago • 104
AutoResearchClaw: Self-Reinforcing Autonomous Research with Human-AI Collaboration Paper • 2605.20025 • Published 22 days ago • 186
MemEye: A Visual-Centric Evaluation Framework for Multimodal Agent Memory Paper • 2605.15128 • Published 27 days ago • 62
OpenSeeker-v2: Pushing the Limits of Search Agents with Informative and High-Difficulty Trajectories Paper • 2605.04036 • Published May 5 • 69
LLMs Improving LLMs: Agentic Discovery for Test-Time Scaling Paper • 2605.08083 • Published May 8 • 69
CollabVR: Collaborative Video Reasoning with Vision-Language and Video Generation Models Paper • 2605.08735 • Published May 9 • 70
MiniCPM-o 4.5: Towards Real-Time Full-Duplex Omni-Modal Interaction Paper • 2604.27393 • Published Apr 30 • 78
Training Long-Context Vision-Language Models Effectively with Generalization Beyond 128K Context Paper • 2605.13831 • Published 28 days ago • 87
OpenSearch-VL: An Open Recipe for Frontier Multimodal Search Agents Paper • 2605.05185 • Published May 6 • 102
MMSkills: Towards Multimodal Skills for General Visual Agents Paper • 2605.13527 • Published 27 days ago • 118
Skill1: Unified Evolution of Skill-Augmented Agents via Reinforcement Learning Paper • 2605.06130 • Published May 7 • 112