Any chance of a GGUF for this?
It would be great to have a GGUF for this, which would enable everyone running llama.cpp and projects built on it (e.g. Ollama, Jan, Msty, LM Studio, etc.), along with others such as MistralRS, to run the model.
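For context, once llama.cpp gains support for the Phi-3.5 architecture, producing a GGUF would typically follow the standard llama.cpp conversion workflow. The commands below are a sketch based on current llama.cpp conventions; the exact model id and script behavior for Phi-3.5 are assumptions until support actually lands:

```shell
# Sketch of the usual llama.cpp GGUF workflow, assuming Phi-3.5
# architecture support is merged into llama.cpp.

# 1. Clone llama.cpp and install the conversion script's dependencies
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
pip install -r requirements.txt

# 2. Download the Hugging Face checkpoint (model id assumed here)
huggingface-cli download microsoft/Phi-3.5-mini-instruct \
    --local-dir ./Phi-3.5-mini-instruct

# 3. Convert the checkpoint to a 16-bit GGUF
python convert_hf_to_gguf.py ./Phi-3.5-mini-instruct \
    --outfile phi-3.5-mini-f16.gguf

# 4. Optionally quantize (e.g. Q4_K_M) with the llama-quantize tool
cmake -B build && cmake --build build --target llama-quantize
./build/bin/llama-quantize phi-3.5-mini-f16.gguf \
    phi-3.5-mini-Q4_K_M.gguf Q4_K_M
```

The resulting quantized GGUF could then be loaded directly by llama.cpp and the downstream projects mentioned above.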
Thank you for your interest in the Phi-3.5 models! We are excited to see contributions from the community.
Even though SLMs like Phi-3/Phi-3.5 can run on ONNX Runtime, they are still not supported by open-source platforms like llama.cpp. Some devices/OSes, such as macOS on x86, cannot get ONNX Runtime support in time (e.g. the onnxruntime and onnxruntime-genai Python libs). Techniques like LongRoPE, Flash Attention, and the CLIP ViT projector need to be supported by open-source platforms like llama.cpp ASAP. Microsoft needs to lend a hand to the open-source communities to make it happen.