fp8 version

#21
by MayensGuds - opened

Quants would be amazing <3. If no fancy quants are available, it'd be nice to have an fp8 version, pls <3

I already saw an Int8 version in this video, and it can run with as little as 6GB of VRAM: https://youtu.be/nur4_b4yzM0?t=421

It can be loaded as fp8, but the weights will still be bf16 or fp16 (I'm not sure which one the current weights are).

Update! Check my repo, I quantized the weights.

How does the model perform with the weights quantised?
