PAN: A World Model for General, Interactable, and Long-Horizon World Simulation Paper • 2511.09057 • Published Nov 12 • 75
AimBot: A Simple Auxiliary Visual Cue to Enhance Spatial Awareness of Visuomotor Policies Paper • 2508.08113 • Published Aug 11 • 11
From Behavioral Performance to Internal Competence: Interpreting Vision-Language Models with VLM-Lens Paper • 2510.02292 • Published Oct 2 • 1
Communication and Verification in LLM Agents towards Collaboration under Information Asymmetry Paper • 2510.25595 • Published Oct 29
ROVER: Benchmarking Reciprocal Cross-Modal Reasoning for Omnimodal Generation Paper • 2511.01163 • Published Nov 3 • 31
Efficient Long-context Language Model Training by Core Attention Disaggregation Paper • 2510.18121 • Published Oct 20 • 122
Do Vision-Language Models Have Internal World Models? Towards an Atomic Evaluation Paper • 2506.21876 • Published Jun 27 • 28
Contrasting Adversarial Perturbations: The Space of Harmless Perturbations Paper • 2402.02095 • Published Feb 3, 2024
Can Vision Language Models Infer Human Gaze Direction? A Controlled Study Paper • 2506.05412 • Published Jun 4 • 4
4D-LRM: Large Space-Time Reconstruction Model From and To Any View at Any Time Paper • 2506.18890 • Published Jun 23 • 6
Dynamic Rewarding with Prompt Optimization Enables Tuning-free Self-Alignment of Language Models Paper • 2411.08733 • Published Nov 13, 2024 • 1
Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective Paper • 2506.14965 • Published Jun 17 • 49