Gamma-World: Generative Multi-Agent World Modeling Beyond Two Players Paper • 2605.28816 • Published 4 days ago • 356
MotiMotion: Motion-Controlled Video Generation with Visual Reasoning Paper • 2605.22818 • Published 10 days ago • 3
A Survey of Large Audio Language Models: Generalization, Trustworthiness, and Outlook Paper • 2605.20266 • Published 13 days ago • 56
From Plans to Pixels: Learning to Plan and Orchestrate for Open-Ended Image Editing Paper • 2605.15181 • Published 17 days ago • 12
The Scaling Properties of Implicit Deductive Reasoning in Transformers Paper • 2605.04330 • Published 26 days ago • 5
End-to-End Autoregressive Image Generation with 1D Semantic Tokenizer Paper • 2605.00503 • Published 30 days ago • 12
From Context to Skills: Can Language Models Learn from Context Skillfully? Paper • 2604.27660 • Published 28 days ago • 166
Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond Paper • 2604.22748 • Published Apr 24 • 227
A Systematic Study of Cross-Modal Typographic Attacks on Audio-Visual Reasoning Paper • 2604.03995 • Published Apr 5 • 4
MARS: Enabling Autoregressive Models Multi-Token Generation Paper • 2604.07023 • Published Apr 8 • 38
HY-Embodied-0.5: Embodied Foundation Models for Real-World Agents Paper • 2604.07430 • Published Apr 8 • 189