fp8 version
#21
by
MayensGuds
- opened
Quants would be amazing <3. If no fancy quants are available, it'd be nice to have an fp8 version pls <3
I already saw an Int8 version in this video, and it can run with as little as 6GB of VRAM: https://youtu.be/nur4_b4yzM0?t=421
It can be loaded as fp8, but the weights will still be bf16 or fp16 (I don't know which one the current weights are).
Update! Check my repo, I quantized the weights.
How does the model perform with the weights quantised?