How can I quantize this model?
#7
by
dantepalacio
- opened
I don't have 32 GB of VRAM, which isn't enough to run the model in half precision from main.py. How can I run it in 8-bit, for example?
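One common approach is 8-bit loading via `bitsandbytes` through the `transformers` API. A minimal sketch, assuming the model is a standard `transformers` causal LM; `MODEL_ID` is a placeholder since the thread does not name the checkpoint, so substitute whatever ID `main.py` actually loads:

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Placeholder -- replace with the checkpoint that main.py loads.
MODEL_ID = "your-org/your-model"

# 8-bit quantization config (requires the `bitsandbytes` package and a CUDA GPU).
quant_config = BitsAndBytesConfig(load_in_8bit=True)

def load_model():
    # device_map="auto" lets `accelerate` place layers across GPU/CPU,
    # which also helps when VRAM alone is insufficient.
    return AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        quantization_config=quant_config,
        device_map="auto",
    )
```

This roughly halves memory relative to fp16 weights. Whether it works for this particular repo depends on the architecture being supported by `bitsandbytes`.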