OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM Paper • 2510.15870 • Published 4 days ago • 58
Scaling Instruction-Based Video Editing with a High-Quality Synthetic Dataset Paper • 2510.15742 • Published 4 days ago • 39
DITING: A Multi-Agent Evaluation Framework for Benchmarking Web Novel Translation Paper • 2510.09116 • Published 11 days ago • 94
ERA: Transforming VLMs into Embodied Agents via Embodied Prior Learning and Online Reinforcement Learning Paper • 2510.12693 • Published 7 days ago • 25
Generative Universal Verifier as Multimodal Meta-Reasoner Paper • 2510.13804 • Published 6 days ago • 24
Multimodal Prompt Optimization: Why Not Leverage Multiple Modalities for MLLMs Paper • 2510.09201 • Published 11 days ago • 46
BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution Paper • 2510.08697 • Published 12 days ago • 28
Better Together: Leveraging Unpaired Multimodal Data for Stronger Unimodal Models Paper • 2510.08492 • Published 12 days ago • 6
Advancing End-to-End Pixel Space Generative Modeling via Self-supervised Pre-training Paper • 2510.12586 • Published 7 days ago • 104
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs Paper • 2510.11696 • Published 8 days ago • 163
Demystifying Reinforcement Learning in Agentic Reasoning Paper • 2510.11701 • Published 8 days ago • 30
Diffusion Transformers with Representation Autoencoders Paper • 2510.11690 • Published 8 days ago • 153