MixSD: Mixed Contextual Self-Distillation for Knowledge Injection Paper • 2605.16865 • Published 14 days ago • 7
Video2GUI: Synthesizing Large-Scale Interaction Trajectories for Generalized GUI Agent Pretraining Paper • 2605.14747 • Published 16 days ago • 145
Symmetry-Compatible Principle for Optimizer Design: Embeddings, LM Heads, SwiGLU MLPs, and MoE Routers Paper • 2605.18106 • Published 12 days ago • 3
The Granularity Axis: A Micro-to-Macro Latent Direction for Social Roles in Language Models Paper • 2605.06196 • Published 23 days ago • 9
LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model Paper • 2604.20796 • Published Apr 22 • 242
Crowded in B-Space: Calibrating Shared Directions for LoRA Merging Paper • 2604.16826 • Published Apr 18 • 18
Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability Paper • 2604.06628 • Published Apr 8 • 326
Adam's Law: Textual Frequency Law on Large Language Models Paper • 2604.02176 • Published Apr 2 • 504
GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning Paper • 2604.02721 • Published Apr 3 • 630
CutClaw: Agentic Hours-Long Video Editing via Music Synchronization Paper • 2603.29664 • Published Mar 31 • 50
Astrolabe: Steering Forward-Process Reinforcement Learning for Distilled Autoregressive Video Models Paper • 2603.17051 • Published Mar 17 • 109
SocialOmni: Benchmarking Audio-Visual Social Interactivity in Omni Models Paper • 2603.16859 • Published Mar 17 • 248