How can I quantize this model?

#7
by dantepalacio - opened

I don't have enough VRAM (32 GB) to run the model in half precision from main.py. How can I run it in 8-bit, for example?
