Hey :) LM Studio support and picking your brain
Hi. I did the 13B GGUF for these and was wondering if you would be so kind as to point me at the docs / script / something you used to compress the CLIP mmproj. I recall seeing something like that being available on some Linux branch of something, but for the life of me I can't dredge it up.
Would really appreciate the assist.
In other news: to make this LM Studio compatible out of the box, you might consider renaming the adaptors to mmproj-Q4_0.gguf (et al.).
Thanks for this version.
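For anyone wanting to script the rename suggested above, here is a minimal sketch. It assumes only the `mmproj-<quant>.gguf` pattern mentioned in the post; the function names and the example source filename are hypothetical, and it defaults to a dry run so nothing is touched until you opt in.

```python
from pathlib import Path


def lmstudio_mmproj_name(quant: str) -> str:
    """Build a filename following the mmproj-<quant>.gguf pattern from the post."""
    return f"mmproj-{quant}.gguf"


def rename_adaptor(path: Path, quant: str, dry_run: bool = True) -> Path:
    """Return the new path for an adaptor file; rename it only if dry_run is False."""
    target = path.with_name(lmstudio_mmproj_name(quant))
    if not dry_run:
        # Refuse to clobber an existing file with the same target name.
        if target.exists():
            raise FileExistsError(f"{target} already exists")
        path.rename(target)
    return target


if __name__ == "__main__":
    # Hypothetical source filename, used purely for illustration.
    print(rename_adaptor(Path("llava-13b-adaptor-q4_0.gguf"), "Q4_0"))
```

Pass `dry_run=False` once the printed target looks right.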
Hey! I heard about your project for the first time a few days ago. It's pretty cool.
If you need to quantize the CLIP mmproj files, you can use the llamafile-llava-quantize-0.2.1 program from https://github.com/Mozilla-Ocho/llamafile/releases/tag/0.2.1
All the code that's needed for doing this is in the llama.cpp project upstream, but no one wrote a main() function for it until I came along. I hope you find it useful. I'm very much looking forward to using LLaVA 13B when you post it, since I honestly have no idea how to quantize the other file!
Thanks very much. I'm also on the verge of releasing NOUS. Again, I will eventually be using the encoding quantiser when I get it working; in the interim I'll just post what I have.