Where GGUF?


Yeah, it would be great if this could be applied to GGUF or EXL2 quantisation; GPTQ isn't very widely used anymore.

I hadn't used GPTQ for like a year. When I tried this in the latest ooba, it produced garbage characters lol

Owner

I am looking into the instructions and will give it a try.

If anyone knows how to convert GPTQ models to GGUF or EXL2, please help me out. Thank you!
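For context, my rough understanding of the pipeline is: dequantize each GPTQ linear layer back to fp16, save a plain HF checkpoint, and then re-quantize with llama.cpp's own tools. Below is a minimal sketch of the per-group dequantization math only -- real GPTQ checkpoints pack the integer weights and zero points into int32, so the names and shapes here are assumptions for illustration, not any library's actual storage format:

```python
# Illustrative sketch: the per-group dequantization a GPTQ -> fp16 -> GGUF
# conversion would need to perform. Not the packed layout of real GPTQ files.
import torch

def dequantize_group_quantized(q_int: torch.Tensor,   # integer weights, shape (in_features, out_features)
                               scales: torch.Tensor,  # per-group scales, shape (in_features // group_size, out_features)
                               zeros: torch.Tensor,   # per-group zero points, same shape as scales
                               group_size: int = 128) -> torch.Tensor:
    """Map integer weights back to fp16: w = (q - zero) * scale, applied per group of input rows."""
    group_idx = torch.arange(q_int.shape[0]) // group_size       # group index for each input row
    w = (q_int.float() - zeros[group_idx].float()) * scales[group_idx].float()
    return w.half()

# Toy usage: a 2-bit layer with 256 input features, 8 output features, group size 128.
q = torch.randint(0, 4, (256, 8))             # 2-bit values are integers in [0, 3]
s = torch.rand(256 // 128, 8) * 0.1           # one scale per (group, output) pair
z = torch.full((256 // 128, 8), 2)            # mid-range zero point
w_fp16 = dequantize_group_quantized(q, s, z)  # fp16 weights, ready to save and re-quantize
```

Once the fp16 checkpoint is saved, llama.cpp's usual HF-to-GGUF conversion script and its quantize tool should be able to take it from there (though re-quantizing an already 2-bit model will compound the quantization error).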

Hmm, I wonder if there's a more universal format to convert to? I've been trying to figure out how to run this on Apple Silicon (Metal/MPS). The closest thing I've found so far is Mistral.rs -- the dev is lightning quick with updates, it seems, and recently added some GPTQ support, FYI. IIRC, one should be able to convert the GPTQ model to GGUF/GGML, but even if not, one can definitely run a GPTQ-quantized model on Mistral.rs -- 2-bit, odd bit-widths, no problem. Plus, it's Rust, which in itself means some boosts to performance/reliability!

Sadly, that support doesn't translate over to Metal just yet.

Metal/MPS can run with a Triton kernel now, so that's another possibility if you really wanna open up your wonderful, awesome-sauce quants to everyone 😉🤣 (just spitballing -- I was only skimming all of this a week ago, but I believe that's a valid avenue as far as GPTQ compatibility goes).

Owner

@BuildBackBuehler
Thanks for your interest.

Recently, T-MAC from Microsoft added support for running EfficientQAT-quantized models. Additionally, the reported speed is even faster than llama.cpp.

Hello @ChenMnZ,

Can we run this quantized model with T-MAC or Mistral.rs? Have you tried them with this 2-bit model? Thanks!

Owner

@MLDataScientist The owner of T-MAC has tried this. You can refer to https://github.com/OpenGVLab/EfficientQAT/issues/3#issuecomment-2298608707 for details.
