https://huggingface.co/meta-llama/Llama-3.2-90B-Vision-Instruct
#328 opened by Cypherfox
Greetings,
I would really like this, especially the 8- and 6-bit quants. If that's not something you can do (either because it's damn huge, or for geoblock reasons), could you point me to the right programs to do the generation? I can do it myself locally; I have decent hardware for it.
I tried using the standard llama.cpp conversion:
python llama.cpp/convert_hf_to_gguf.py meta-llama_Llama-3.2-90B-Vision-Instruct --outfile l32-90b-vi.gguf --outtype q8_0
and I got an error message:
INFO:hf-to-gguf:Loading model: meta-llama_Llama-3.2-90B-Vision-Instruct
ERROR:hf-to-gguf:Model MllamaForConditionalGeneration is not supported
Is this just not something supported yet (as it says), or am I using the wrong tools?
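For what it's worth, the route I was planning once conversion works is the usual two-step llama.cpp flow (convert to a full-precision GGUF, then quantize with llama-quantize, wherever your build puts that binary); the file names below are just placeholders I'd use, nothing official:
python llama.cpp/convert_hf_to_gguf.py meta-llama_Llama-3.2-90B-Vision-Instruct --outfile l32-90b-vi-f16.gguf --outtype f16
llama-quantize l32-90b-vi-f16.gguf l32-90b-vi-q8_0.gguf Q8_0
llama-quantize l32-90b-vi-f16.gguf l32-90b-vi-q6_k.gguf Q6_K
Right now it fails at the first step, since the converter refuses the architecture.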
-- Morgan
It's just not something supported by llama.cpp yet; what you are trying should work once it is supported.
mradermacher changed discussion status to closed
https://github.com/ggerganov/llama.cpp/issues/9663 is the bug ticket for llama.cpp support. If support is added, you can remind me and I will quantize it.
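If you want to check whether your local llama.cpp checkout has picked up support in the meantime, a crude but simple test is to grep the converter for the architecture name from the error message:
grep -n MllamaForConditionalGeneration llama.cpp/convert_hf_to_gguf.py
No output means your copy of the converter still doesn't know about the model.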