https://huggingface.co/meta-llama/Llama-3.2-90B-Vision-Instruct
#328 opened by Cypherfox
Greetings,
I would really like this, especially the 8- and 6-bit quants. If that's not something you can do (either because it's damn huge, or for geoblock reasons), could you point me to the right programs to do the generation? I can do it myself locally; I have decent hardware for it.
I tried using the standard llama.cpp conversion:
python llama.cpp/convert_hf_to_gguf.py meta-llama_Llama-3.2-90B-Vision-Instruct --outfile l32-90b-vi.gguf --outtype q8_0
and I got an error message:
INFO:hf-to-gguf:Loading model: meta-llama_Llama-3.2-90B-Vision-Instruct
ERROR:hf-to-gguf:Model MllamaForConditionalGeneration is not supported
Is this just not something supported yet (as it says), or am I using the wrong tools?
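For what it's worth, the route I was planning once conversion works is the usual two-step llama.cpp flow (convert to a full-precision GGUF, then quantize with llama-quantize, wherever your build puts that binary); the file names below are just placeholders I'd use, nothing official:
python llama.cpp/convert_hf_to_gguf.py meta-llama_Llama-3.2-90B-Vision-Instruct --outfile l32-90b-vi-f16.gguf --outtype f16
llama-quantize l32-90b-vi-f16.gguf l32-90b-vi-q8_0.gguf Q8_0
llama-quantize l32-90b-vi-f16.gguf l32-90b-vi-q6_k.gguf Q6_K
Right now it fails at the first step, since the converter refuses the architecture.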
-- Morgan
It's just not something supported by llama.cpp yet; what you are trying should work once it is supported.
mradermacher changed discussion status to closed
https://github.com/ggerganov/llama.cpp/issues/9663 is the bug ticket for llama.cpp support. If support is added, you can remind me and I will quantize it.
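If you want to check whether your local llama.cpp checkout has picked up support in the meantime, a crude but simple test is to grep the converter for the architecture name from the error message:
grep -n MllamaForConditionalGeneration llama.cpp/convert_hf_to_gguf.py
No output means your copy of the converter still doesn't know about the model.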