Robin3D: Improving 3D Large Language Model via Robust Instruction Tuning • Paper • 2410.00255 • Published Sep 30, 2024 • 5
Multi-Layer Transformers Gradient Can be Approximated in Almost Linear Time • Paper • 2408.13233 • Published Aug 23, 2024 • 20
SceneScript: Reconstructing Scenes With An Autoregressive Structured Language Model • Paper • 2403.13064 • Published Mar 19, 2024 • 31
MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems? • Paper • 2403.14624 • Published Mar 21, 2024 • 51
EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions • Paper • 2402.17485 • Published Feb 27, 2024 • 188
Playground v2.5: Three Insights towards Enhancing Aesthetic Quality in Text-to-Image Generation • Paper • 2402.17245 • Published Feb 27, 2024 • 10
Sora Generates Videos with Stunning Geometrical Consistency • Paper • 2402.17403 • Published Feb 27, 2024 • 16
Repaint123: Fast and High-quality One Image to 3D Generation with Progressive Controllable 2D Repainting • Paper • 2312.13271 • Published Dec 20, 2023 • 4
StreamDiffusion: A Pipeline-level Solution for Real-time Interactive Generation • Paper • 2312.12491 • Published Dec 19, 2023 • 69
How FaR Are Large Language Models From Agents with Theory-of-Mind? • Paper • 2310.03051 • Published Oct 4, 2023 • 34
Large Language Models Cannot Self-Correct Reasoning Yet • Paper • 2310.01798 • Published Oct 3, 2023 • 33