No GGUF Quantization?
Is there a reason this doesn't get a GGUF quantization? Thanks for providing it in GPTQ; I don't want to sound ungrateful. Thanks for the hard work, TheBloke.
What would it take to export it to GGUF?
I didn't make GGUFs because I don't believe it's possible to use LLaVA with GGUF at this time. Getting the image-processing side working requires other components which aren't supported in GGUF yet.
Actually, llama.cpp's llava example works with LLaVA GGUFs (see the example command after this list):
LLaVA 1.5
- 7B - https://huggingface.co/mys/ggml_llava-v1.5-7b
- 13B - https://huggingface.co/mys/ggml_llava-v1.5-13b
BakLLaVA 1 - https://huggingface.co/mys/ggml_bakllava-1
Obsidian-3B-V0.5 - https://huggingface.co/nisten/obsidian-3b-multimodal-q6-gguf
Needs a llama.cpp fork though (because of its StableLM-3B-4e1t base); instructions are in the model card.
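
For reference, here's roughly how the llama.cpp llava example is invoked. This is a sketch, not a definitive recipe: the binary name has changed across llama.cpp versions (`./llava` in early builds, `./llava-cli` later), the two GGUF file names are the ones published in the mys/ggml_llava-v1.5-7b repo above, and the image path and prompt are placeholders.

```sh
# Sketch of running LLaVA 1.5 via llama.cpp's llava example.
# Binary name varies by llama.cpp version (./llava, later ./llava-cli).
# ggml-model-q4_k.gguf is the quantized language model;
# mmproj-model-f16.gguf is the separate image-projector component
# mentioned above. Both come from mys/ggml_llava-v1.5-7b.
# Image path and prompt below are placeholders.
./llava-cli \
  -m ggml-model-q4_k.gguf \
  --mmproj mmproj-model-f16.gguf \
  --image ./test-image.jpg \
  -p "Describe this image in detail."
```

So the missing "other components" are covered by shipping the image projector as its own mmproj GGUF alongside the language model.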
Found this: https://huggingface.co/jartine/llava-v1.5-7B-GGUF
It's part of the llamafile project from Mozilla.
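
In case it helps anyone: a llamafile bundles the model weights and the llama.cpp server into a single executable, so running one looks roughly like this. The file name below is illustrative; check that repo for the actual downloads.

```sh
# Sketch of running a LLaVA llamafile; the file name is illustrative,
# see the jartine repo above for the real ones.
chmod +x llava-v1.5-7b-q4.llamafile
./llava-v1.5-7b-q4.llamafile
# This starts a local llama.cpp server with a browser chat UI
# (by default at http://localhost:8080) where you can upload images.
```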