Add quantization examples using torchao and quanto
#2 opened by a-r-r-o-w (HF staff)
Hey, I'm Aryan from the Diffusers team 👋
Congratulations on the release of CogVideoX-5B!
It would be great to showcase some examples of how quantized inference (int8 and other datatypes) can be run to lower memory requirements using TorchAO and Quanto, especially since we mention it in the model card table. Feel free to modify the code/wording/URLs in whichever way you see fit. Could we do it for the Chinese README, CogVideoX-2B, and the CogVideo GitHub repo as well? Thanks!
zRzRzRzRzRzRzR changed pull request status to merged