Need GGUF support
Is anyone from the Apple team seeing this? Please add a GGUF format for this model.
Thank you!
I want that too
Hey hey @huntz47 & @sdyy - sorry for the delayed response. OpenELM is supported in llama.cpp!
I created some quants for the instruct models here:
450M - https://huggingface.co/reach-vb/OpenELM-450M-Instruct-Q8_0-GGUF
1.1B - https://huggingface.co/reach-vb/OpenELM-1_1B-Instruct-Q8_0-GGUF
3B - https://huggingface.co/reach-vb/OpenELM-3B-Instruct-Q8_0-GGUF
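For a quick smoke test outside the llama.cpp CLI, here's a minimal sketch using the llama-cpp-python bindings (which wrap llama.cpp). The GGUF filename glob, context size, and sampling settings are assumptions on my part - check the repo for the actual file name:

```python
# Minimal sketch: run one of the Q8_0 quants above via llama-cpp-python.
# Assumes `pip install llama-cpp-python huggingface_hub` and that the repo's
# GGUF file matches the "*q8_0.gguf" glob - adjust if the filename differs.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="reach-vb/OpenELM-450M-Instruct-Q8_0-GGUF",
    filename="*q8_0.gguf",  # glob for the quantized weights (assumed name)
    n_ctx=2048,             # OpenELM was trained with a 2048-token context
    verbose=False,
)

out = llm(
    "Explain what a GGUF file is in one sentence.",
    max_tokens=64,
    temperature=0.7,
)
print(out["choices"][0]["text"])
```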
Note: I found quite a bit of quality degradation below Q8, but if you want to create other quants, feel free to use the GGUF-my-repo space for it: https://huggingface.co/spaces/ggml-org/gguf-my-repo
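If you'd rather do the conversion locally instead of through the space, the rough flow with llama.cpp's own tools looks something like the sketch below. Paths, output names, and the Q4_K_M target are my assumptions, and note that the apple/OpenELM repos don't ship a tokenizer, so you may need to copy the Llama-2 tokenizer files into the local model folder first:

```python
# Rough local alternative to the GGUF-my-repo space, using llama.cpp's own tools.
# Sketch only: adjust the llama.cpp path, model snapshot path, and quant type to taste.
import subprocess

LLAMA_CPP = "llama.cpp"               # path to a checked-out (and built) llama.cpp repo
MODEL_DIR = "OpenELM-450M-Instruct"   # local snapshot of apple/OpenELM-450M-Instruct

# 1) Convert the Hugging Face checkpoint to an f16 GGUF.
subprocess.run(
    ["python", f"{LLAMA_CPP}/convert_hf_to_gguf.py", MODEL_DIR,
     "--outfile", "openelm-450m-instruct-f16.gguf", "--outtype", "f16"],
    check=True,
)

# 2) Re-quantize to a smaller type (expect quality loss below Q8_0, as noted above).
#    The binary may live under build/bin/ depending on how you built llama.cpp.
subprocess.run(
    [f"{LLAMA_CPP}/llama-quantize",
     "openelm-450m-instruct-f16.gguf", "openelm-450m-instruct-q4_k_m.gguf", "Q4_K_M"],
    check=True,
)
```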
The inference instructions are in the model cards. Enjoy, and do let me know if you have any questions!