MOSS-Audio-Tokenizer: Scaling Audio Tokenizers for Future Audio Foundation Models Paper • 2602.10934 • Published 11 days ago • 49
MOVA: Towards Scalable and Synchronized Video-Audio Generation Paper • 2602.08794 • Published 13 days ago • 152
SPARKLING: Balancing Signal Preservation and Symmetry Breaking for Width-Progressive Learning Paper • 2602.02472 • Published 19 days ago • 44
UniReason 1.0: A Unified Reasoning Framework for World Knowledge Aligned Image Generation and Editing Paper • 2602.02437 • Published 19 days ago • 76
AgentLongBench: A Controllable Long Benchmark For Long-Contexts Agents via Environment Rollouts Paper • 2601.20730 • Published 25 days ago • 19
TL-GRPO: Turn-Level RL for Reasoning-Guided Iterative Optimization Paper • 2601.16480 • Published 30 days ago • 52
ABC-Bench: Benchmarking Agentic Backend Coding in Real-World Development Paper • 2601.11077 • Published Jan 16 • 65
MOSS Transcribe Diarize: Accurate Transcription with Speaker Diarization Paper • 2601.01554 • Published Jan 4 • 57
Beyond Real: Imaginary Extension of Rotary Position Embeddings for Long-Context LLMs Paper • 2512.07525 • Published Dec 8, 2025 • 59
Nex-N1: Agentic Models Trained via a Unified Ecosystem for Large-Scale Environment Construction Paper • 2512.04987 • Published Dec 4, 2025 • 80
Video-as-Answer: Predict and Generate Next Video Event with Joint-GRPO Paper • 2511.16669 • Published Nov 20, 2025 • 32
UniAVGen: Unified Audio and Video Generation with Asymmetric Cross-Modal Interactions Paper • 2511.03334 • Published Nov 5, 2025 • 53
Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm Paper • 2511.04570 • Published Nov 6, 2025 • 240
OS-Sentinel: Towards Safety-Enhanced Mobile GUI Agents via Hybrid Validation in Realistic Workflows Paper • 2510.24411 • Published Oct 28, 2025 • 72