fp8 version

#21
by MayensGuds - opened

Quants would be amazing <3. If no fancy quants are available, it'd be nice to have an fp8 version, pls <3

I already saw an Int8 version in this video, and it can run with as little as 6GB of VRAM: https://youtu.be/nur4_b4yzM0?t=421

It can be loaded as fp8, but the weights will still be bf16 or fp16 (I'm not sure which one the current weights are).

Update! Check my repo, I quantized the weights.

How does the model perform with the weights quantised?
