Speed optimization is possible with low impact on VRAM (TeaCache, Efficient Time Embeddings, batching forward passes)

#111

by Jd1911 - opened 6 days ago

Discussion

Jd1911

6 days ago

Hi Kijai

Hope all is well

I'm not sure (and I don't know enough to judge) whether these optimizations can be implemented.

The team at Voltage Park ran an experiment that resulted in a 3.1x speedup in generation time. We can leave FlashAttention 3 out of it, but the other methods are super promising:

Efficient Time Embeddings
Batching Forward Passes
Intelligent Caching with TeaCache

It is not the standard TeaCache.

Do you think you can apply these through your wrapper?

More here: https://www.voltagepark.com/blog/accelerating-wan2-2-from-4-67s-to-1-5s-per-denoising-step-through-targeted-optimizations

Kijai

Owner 4 days ago

Hey,

Thanks for the link, but I don't really see anything new there. We've had all of this in ComfyUI for a while, and TeaCache is pretty outdated by now, with newer alternatives such as MagCache and EasyCache available.
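For anyone following along who hasn't looked at how these step caches work, here is a rough sketch of the general idea as I understand it: watch a cheap signal each denoising step, and if it has barely changed since the last full forward pass, reuse the cached residual instead of running the model again. This is a toy illustration only. It is not the wrapper's code, not ComfyUI's, and not the variant from the Voltage Park post; `StepCache`, `rel_l1`, and the threshold value are names and numbers made up for the example, and real TeaCache also rescales the distance with a fitted polynomial, which is left out here.

```python
from typing import Callable, List, Optional

Vector = List[float]


def rel_l1(current: Vector, previous: Vector) -> float:
    """Relative L1 distance between two equally sized vectors."""
    num = sum(abs(c - p) for c, p in zip(current, previous))
    den = sum(abs(p) for p in previous)
    return num / den if den > 0 else float("inf")


class StepCache:
    """Toy threshold-based step cache (hypothetical, for illustration only)."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold
        self.accumulated = 0.0  # change accumulated since the last full pass
        self.prev_signal: Optional[Vector] = None
        self.residual: Optional[Vector] = None  # cached (output - input)
        self.skipped = 0

    def step(
        self,
        x: Vector,
        signal: Vector,
        model: Callable[[Vector], Vector],
    ) -> Vector:
        # Decide whether the cheap signal moved enough to justify a full pass.
        if self.prev_signal is not None and self.residual is not None:
            self.accumulated += rel_l1(signal, self.prev_signal)
            should_compute = self.accumulated >= self.threshold
        else:
            should_compute = True  # nothing cached yet
        self.prev_signal = list(signal)

        if should_compute:
            out = model(x)
            self.residual = [o - i for o, i in zip(out, x)]
            self.accumulated = 0.0
            return out

        # Skip the model and reuse the residual from the last full pass.
        self.skipped += 1
        return [i + r for i, r in zip(x, self.residual)]
```

A higher threshold skips more steps and trades quality for speed. The extra memory is roughly one cached residual per cached model output, so the VRAM impact stays small.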
