- ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
  Paper • 2406.04325 • Published • 71
- SF-V: Single Forward Video Generation Model
  Paper • 2406.04324 • Published • 23
- VideoTetris: Towards Compositional Text-to-Video Generation
  Paper • 2406.04277 • Published • 22
- Vript: A Video Is Worth Thousands of Words
  Paper • 2406.06040 • Published • 22
Collections including paper arxiv:2406.04277
- EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions
  Paper • 2402.17485 • Published • 188
- MusicHiFi: Fast High-Fidelity Stereo Vocoding
  Paper • 2403.10493 • Published • 16
- Music Consistency Models
  Paper • 2404.13358 • Published • 12
- Seed-TTS: A Family of High-Quality Versatile Speech Generation Models
  Paper • 2406.02430 • Published • 29

- WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens
  Paper • 2401.09985 • Published • 15
- CustomVideo: Customizing Text-to-Video Generation with Multiple Subjects
  Paper • 2401.09962 • Published • 8
- Inflation with Diffusion: Efficient Temporal Adaptation for Text-to-Video Super-Resolution
  Paper • 2401.10404 • Published • 10
- ActAnywhere: Subject-Aware Video Background Generation
  Paper • 2401.10822 • Published • 13

- One-for-All: Generalized LoRA for Parameter-Efficient Fine-tuning
  Paper • 2306.07967 • Published • 24
- Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation
  Paper • 2306.07954 • Published • 113
- TryOnDiffusion: A Tale of Two UNets
  Paper • 2306.08276 • Published • 73
- Seeing the World through Your Eyes
  Paper • 2306.09348 • Published • 33