FastVideo/FastHunyuan


foreverpiano committed on Jan 1

Commit 725712e · verified · 1 Parent(s): 6cb07f5

Update README.md

Files changed (1)

README.md +7 -0


@@ -47,3 +47,10 @@ We provide some qualitative comparison between FastHunyuan 6 step inference v.s.
 | ![FastHunyuan 6 step](assets/distilled/3.gif) | ![Hunyuan 6 step](assets/undistilled/3.gif) |
 | ![FastHunyuan 6 step](assets/distilled/4.gif) | ![Hunyuan 6 step](assets/undistilled/4.gif) |

+## Memory requirements
+FastHunyuan supports NF4 and LLM-INT8 quantized inference using BitsAndBytes. With NF4 quantization, inference runs on a single RTX 4090 GPU and requires just 20 GB of VRAM.
+For LoRA finetuning, the minimum hardware requirements are:
+- 40 GB of GPU memory on each of 2 GPUs with LoRA.
+- 30 GB of GPU memory on each of 2 GPUs with CPU offload and LoRA.
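
The memory figures above can be turned into a quick pre-flight check. The following is a minimal sketch, not part of FastVideo: the `Requirement` class, the `REQUIREMENTS` table, and the `feasible_modes` helper are hypothetical names introduced for illustration, and the numbers are copied from the README text as stated minimums rather than measured values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    """One row of the README's memory requirements (hypothetical helper type)."""
    name: str
    num_gpus: int
    gb_per_gpu: float


# Figures taken from the README text; the mode names are illustrative labels.
REQUIREMENTS = [
    Requirement("NF4 inference", num_gpus=1, gb_per_gpu=20),
    Requirement("LoRA finetune with CPU offload", num_gpus=2, gb_per_gpu=30),
    Requirement("LoRA finetune", num_gpus=2, gb_per_gpu=40),
]


def feasible_modes(gpu_memory_gb):
    """Return the names of modes whose stated minimums the given GPUs meet.

    gpu_memory_gb: list with one entry per GPU, each the memory size in GB.
    """
    # Sort largest first so the k largest GPUs are considered for a k-GPU mode.
    sizes = sorted(gpu_memory_gb, reverse=True)
    modes = []
    for req in REQUIREMENTS:
        has_enough_gpus = len(sizes) >= req.num_gpus
        # The smallest of the k largest GPUs must meet the per-GPU minimum.
        if has_enough_gpus and sizes[req.num_gpus - 1] >= req.gb_per_gpu:
            modes.append(req.name)
    return modes


if __name__ == "__main__":
    # A single 24 GB card, such as an RTX 4090.
    print(feasible_modes([24]))
    # Two 40 GB cards.
    print(feasible_modes([40, 40]))
```

Under these assumptions, a single 24 GB card qualifies only for NF4 inference, while two 40 GB cards meet every listed minimum. Actual memory use may vary with resolution, frame count, and batch size, so treat the result as a lower bound rather than a guarantee.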