- A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems
  Paper • 2508.07407 • Published • 88
- A Survey on Diffusion Language Models
  Paper • 2508.10875 • Published • 33
- GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models
  Paper • 2508.06471 • Published • 171
- Noise Hypernetworks: Amortizing Test-Time Compute in Diffusion Models
  Paper • 2508.09968 • Published • 15
Collections
Collections including paper arxiv:2508.10893
- Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models
  Paper • 2507.13344 • Published • 56
- π^3: Scalable Permutation-Equivariant Visual Geometry Learning
  Paper • 2507.13347 • Published • 64
- MoVieS: Motion-Aware 4D Dynamic View Synthesis in One Second
  Paper • 2507.10065 • Published • 24
- CLiFT: Compressive Light-Field Tokens for Compute-Efficient and Adaptive Neural Rendering
  Paper • 2507.08776 • Published • 54

- lusxvr/nanoVLM-222M
  Image-Text-to-Text • 0.2B • Updated • 287 • 95
- Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
  Paper • 2503.09516 • Published • 35
- AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time
  Paper • 2505.24863 • Published • 98
- QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning
  Paper • 2505.17667 • Published • 89

- AnimaX: Animating the Inanimate in 3D with Joint Video-Pose Diffusion Models
  Paper • 2506.19851 • Published • 59
- SeqTex: Generate Mesh Textures in Video Sequence
  Paper • 2507.04285 • Published • 8
- Yume: An Interactive World Generation Model
  Paper • 2507.17744 • Published • 85
- STream3R: Scalable Sequential 3D Reconstruction with Causal Transformer
  Paper • 2508.10893 • Published • 30

- 4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models
  Paper • 2503.10437 • Published • 33
- Open-Sora 2.0: Training a Commercial-Level Video Generation Model in $200k
  Paper • 2503.09642 • Published • 19
- VGGT: Visual Geometry Grounded Transformer
  Paper • 2503.11651 • Published • 29
- 1000+ FPS 4D Gaussian Splatting for Dynamic Scene Rendering
  Paper • 2503.16422 • Published • 14