Thanks bartowski for the GGUFs!

#7 by ubergarm

For folks running llama.cpp, big thanks to @bartowski for releasing some GGUFs here: https://huggingface.co/bartowski/mistralai_Devstral-Small-2-24B-Instruct-2512-GGUF

I just tested and it seems to be working on both CPU and GPU backends, with both llama.cpp and ik_llama.cpp. I'll try to get some ik_llama.cpp quants out soon too so folks can try these new models.

Cheers, and thanks to the Mistral AI team!

Does anyone happen to know whether scalable softmax has been implemented in either llama.cpp or ik_llama.cpp? I'm seeing scattered reports of poor output quality from llama.cpp and am trying to narrow down the culprit.
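For anyone who hasn't seen it, here's a minimal NumPy sketch of what scalable softmax (SSMax) computes, assuming the formulation from the Scalable-Softmax paper (logits scaled by `s * log(n)` before the usual softmax). The function name and the `s` parameter are illustrative only, not taken from either llama.cpp codebase:

```python
import numpy as np

def ssmax(scores: np.ndarray, s: float = 1.0) -> np.ndarray:
    """Scalable softmax sketch: standard softmax flattens toward
    uniform as the number of scores n grows; scaling logits by
    s * log(n) counteracts that, which is the long-context motivation.
    Assumes the last axis holds the attention scores."""
    n = scores.shape[-1]
    z = s * np.log(n) * scores
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# With s = 1/log(n) this reduces to plain softmax, so a missing
# implementation would effectively drop the log(n) rescaling.
```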
