SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training Paper • 2501.17161 • Published 3 days ago • 49
Optimizing Large Language Model Training Using FP4 Quantization Paper • 2501.17116 • Published 3 days ago • 24
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published 9 days ago • 277
Hunyuan3D 2.0: Scaling Diffusion Models for High Resolution Textured 3D Assets Generation Paper • 2501.12202 • Published 10 days ago • 31
GameFactory: Creating New Games with Generative Interactive Videos Paper • 2501.08325 • Published 17 days ago • 61
MangaNinja: Line Art Colorization with Precise Reference Following Paper • 2501.08332 • Published 17 days ago • 55
MiniMax-01: Scaling Foundation Models with Lightning Attention Paper • 2501.08313 • Published 17 days ago • 271
The GAN is dead; long live the GAN! A Modern GAN Baseline Paper • 2501.05441 • Published 22 days ago • 87
LLM4SR: A Survey on Large Language Models for Scientific Research Paper • 2501.04306 • Published 23 days ago • 33
Agent Laboratory: Using LLM Agents as Research Assistants Paper • 2501.04227 • Published 23 days ago • 85
SPAR3D: Stable Point-Aware Reconstruction of 3D Objects from Single Images Paper • 2501.04689 • Published 23 days ago • 17
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published 23 days ago • 249
Diffusion as Shader: 3D-aware Video Diffusion for Versatile Video Generation Control Paper • 2501.03847 • Published 24 days ago • 23
Cosmos World Foundation Model Platform for Physical AI Paper • 2501.03575 • Published 24 days ago • 67
VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction Paper • 2501.01957 • Published 28 days ago • 42