NVFP4
#14 by reneho - opened
This is great. Any chance of an NVFP4 quant for Blackwell GPUs?
We are discussing this with NVIDIA.
Please don't forget to make an NVFP4 MLX quant for Apple users. Thank you.
Awesome
I've been trying every which way I can think of to quantize this to NVFP4 and so far no amount of hacking on llm-compressor or ModelOpt results in a good quant. The 3D MoE layers are not cooperating.
Please don't forget to make an NVFP4 MLX quant for Apple users. Thank you.
You can use the int4 version for now, in case you did not notice that one is available.