Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model
Abstract
As a fundamental backbone for video generation, diffusion models are challenged by low inference speed due to the sequential nature of denoising. Previous methods speed up the models by caching and reusing model outputs at uniformly selected timesteps. However, such a strategy neglects the fact that differences among model outputs are not uniform across timesteps, which hinders selecting the appropriate model outputs to cache, leading to a poor balance between inference efficiency and visual quality. In this study, we introduce Timestep Embedding Aware Cache (TeaCache), a training-free caching approach that estimates and leverages the fluctuating differences among model outputs across timesteps. Rather than directly using the time-consuming model outputs, TeaCache focuses on model inputs, which have a strong correlation with the model outputs while incurring negligible computational cost. TeaCache first modulates the noisy inputs using the timestep embeddings so that their differences better approximate those of the model outputs. TeaCache then introduces a rescaling strategy to refine the estimated differences and uses them to decide when to cache model outputs. Experiments show that TeaCache achieves up to 4.41x acceleration over Open-Sora-Plan with negligible (-0.07% VBench score) degradation of visual quality.
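The caching idea described in the abstract can be sketched in a few lines: estimate the step-to-step change from the timestep-modulated inputs, rescale that estimate, accumulate it, and only run the expensive forward pass once the accumulated change crosses a threshold. The sketch below is a minimal illustration of this logic, not the authors' implementation; the helper names (`modulate`, `embed_fn`), the linear rescaling coefficients, the threshold value, and the latent update step are all assumed placeholders.

```python
import torch

def rel_l1(curr: torch.Tensor, prev: torch.Tensor) -> float:
    # Relative L1 distance between two tensors.
    return ((curr - prev).abs().mean() / prev.abs().mean()).item()

def denoise_with_teacache_style(model, latents, timesteps, embed_fn, modulate,
                                threshold=0.1, rescale_coeffs=(1.0, 0.0)):
    prev_modulated = None   # timestep-modulated input from the previous step
    cached_output = None    # last fully computed model output
    accumulated = 0.0       # accumulated (rescaled) estimated difference

    for t in timesteps:
        t_emb = embed_fn(t)                    # timestep embedding
        modulated = modulate(latents, t_emb)   # modulate noisy input by t_emb

        if prev_modulated is None:
            recompute = True                   # always compute the first step
        else:
            est = rel_l1(modulated, prev_modulated)
            # Rescale the cheap input-side estimate toward the output-side
            # difference (a linear placeholder; the paper fits a polynomial).
            est = rescale_coeffs[0] * est + rescale_coeffs[1]
            accumulated += est
            recompute = accumulated >= threshold

        if recompute:
            cached_output = model(latents, t)  # expensive forward pass
            accumulated = 0.0                  # reset after refreshing the cache
        # else: reuse cached_output from an earlier timestep

        latents = latents - cached_output      # placeholder update; real schedulers differ
        prev_modulated = modulated

    return latents
```

With a larger `threshold`, more steps reuse the cached output (faster, lower fidelity); with a smaller one, caching is more conservative.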
Community
The following papers were recommended by the Semantic Scholar API
- FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality (2024)
- Accelerating Vision Diffusion Transformers with Skip Branches (2024)
- SmoothCache: A Universal Inference Acceleration Technique for Diffusion Transformers (2024)
- Individual Content and Motion Dynamics Preserved Pruning for Video Diffusion Models (2024)
- Adaptive Caching for Faster Video Generation with Diffusion Transformers (2024)
- Fast and Memory-Efficient Video Diffusion Using Streamlined Inference (2024)
- Accelerating Diffusion Transformers with Token-wise Feature Caching (2024)