Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2508.10893

Research and ideas

A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems

Paper • 2508.07407 • Published 25 days ago • 88
A Survey on Diffusion Language Models

Paper • 2508.10875 • Published 21 days ago • 33
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models

Paper • 2508.06471 • Published 27 days ago • 171
Noise Hypernetworks: Amortizing Test-Time Compute in Diffusion Models

Paper • 2508.09968 • Published 22 days ago • 15

Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models

Paper • 2507.13344 • Published Jul 17 • 56
π^3: Scalable Permutation-Equivariant Visual Geometry Learning

Paper • 2507.13347 • Published Jul 17 • 64
MoVieS: Motion-Aware 4D Dynamic View Synthesis in One Second

Paper • 2507.10065 • Published Jul 14 • 24
CLiFT: Compressive Light-Field Tokens for Compute-Efficient and Adaptive Neural Rendering

Paper • 2507.08776 • Published Jul 11 • 54

about 16 hours ago

lusxvr/nanoVLM-222M

Image-Text-to-Text • 0.2B • Updated May 8 • 287 • 95
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning

Paper • 2503.09516 • Published Mar 12 • 35
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time

Paper • 2505.24863 • Published May 30 • 98
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning

Paper • 2505.17667 • Published May 23 • 89

Arrexel/pattern-diffusion

Text-to-Image • Updated 27 days ago • 802 • 104
stepfun-ai/NextStep-1-Large

Text-to-Image • 15B • Updated 17 days ago • 3.92k • 90
facebook/dinov3-vit7b16-pretrain-lvd1689m

Image Feature Extraction • 7B • Updated 16 days ago • 29k • 133
Skywork/Matrix-Game-2.0

Image-to-Video • Updated 15 days ago • 256

AnimaX: Animating the Inanimate in 3D with Joint Video-Pose Diffusion Models

Paper • 2506.19851 • Published Jun 24 • 59
SeqTex: Generate Mesh Textures in Video Sequence

Paper • 2507.04285 • Published Jul 6 • 8
Yume: An Interactive World Generation Model

Paper • 2507.17744 • Published Jul 23 • 85
STream3R: Scalable Sequential 3D Reconstruction with Causal Transformer

Paper • 2508.10893 • Published 21 days ago • 30

4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models

Paper • 2503.10437 • Published Mar 13 • 33
Open-Sora 2.0: Training a Commercial-Level Video Generation Model in $200k

Paper • 2503.09642 • Published Mar 12 • 19
VGGT: Visual Geometry Grounded Transformer

Paper • 2503.11651 • Published Mar 14 • 29
1000+ FPS 4D Gaussian Splatting for Dynamic Scene Rendering

Paper • 2503.16422 • Published Mar 20 • 14

Research and ideas

A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems

Paper • 2508.07407 • Published 25 days ago • 88
A Survey on Diffusion Language Models

Paper • 2508.10875 • Published 21 days ago • 33
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models

Paper • 2508.06471 • Published 27 days ago • 171
Noise Hypernetworks: Amortizing Test-Time Compute in Diffusion Models

Paper • 2508.09968 • Published 22 days ago • 15

Arrexel/pattern-diffusion

Text-to-Image • Updated 27 days ago • 802 • 104
stepfun-ai/NextStep-1-Large

Text-to-Image • 15B • Updated 17 days ago • 3.92k • 90
facebook/dinov3-vit7b16-pretrain-lvd1689m

Image Feature Extraction • 7B • Updated 16 days ago • 29k • 133
Skywork/Matrix-Game-2.0

Image-to-Video • Updated 15 days ago • 256

Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models

Paper • 2507.13344 • Published Jul 17 • 56
π^3: Scalable Permutation-Equivariant Visual Geometry Learning

Paper • 2507.13347 • Published Jul 17 • 64
MoVieS: Motion-Aware 4D Dynamic View Synthesis in One Second

Paper • 2507.10065 • Published Jul 14 • 24
CLiFT: Compressive Light-Field Tokens for Compute-Efficient and Adaptive Neural Rendering

Paper • 2507.08776 • Published Jul 11 • 54

AnimaX: Animating the Inanimate in 3D with Joint Video-Pose Diffusion Models

Paper • 2506.19851 • Published Jun 24 • 59
SeqTex: Generate Mesh Textures in Video Sequence

Paper • 2507.04285 • Published Jul 6 • 8
Yume: An Interactive World Generation Model

Paper • 2507.17744 • Published Jul 23 • 85
STream3R: Scalable Sequential 3D Reconstruction with Causal Transformer

Paper • 2508.10893 • Published 21 days ago • 30

about 16 hours ago

lusxvr/nanoVLM-222M

Image-Text-to-Text • 0.2B • Updated May 8 • 287 • 95
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning

Paper • 2503.09516 • Published Mar 12 • 35
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time

Paper • 2505.24863 • Published May 30 • 98
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning

Paper • 2505.17667 • Published May 23 • 89

4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models

Paper • 2503.10437 • Published Mar 13 • 33
Open-Sora 2.0: Training a Commercial-Level Video Generation Model in $200k

Paper • 2503.09642 • Published Mar 12 • 19
VGGT: Visual Geometry Grounded Transformer

Paper • 2503.11651 • Published Mar 14 • 29
1000+ FPS 4D Gaussian Splatting for Dynamic Scene Rendering

Paper • 2503.16422 • Published Mar 20 • 14

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs