
Custom Quants for MistralAI Mistral Large v2 123b

IQ4_XXSR: basically IQ4_XS, with attn_q in IQ3_S, attn_v in Q6_K, and token_embed in Q6_0. Yes, you read that correctly: Q6_0 is Ikawrakow's latest traditional quant, and it is not available in mainline Llama.cpp.
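
To make the recipe concrete, here is a minimal Python sketch of the per-tensor mapping described above. It is an illustration only, not the quantization code actually used: the function name `quant_type_for`, the `OVERRIDES` table, and the example tensor names are assumptions introduced here, and the real tensor names in a GGUF file may differ.

```python
# Hypothetical illustration of the IQ4_XXSR recipe: IQ4_XS everywhere,
# except for three tensor kinds that get their own quant type.
# All names below are assumptions made for this sketch.

BASE_TYPE = "IQ4_XS"

# Tensor-name component -> quant type, as listed in the card.
OVERRIDES = {
    "attn_q": "IQ3_S",
    "attn_v": "Q6_K",
    "token_embed": "Q6_0",
}


def quant_type_for(tensor_name: str) -> str:
    """Return the quant type the recipe assigns to a tensor name.

    The name is split on '.' and each component is compared exactly,
    so 'attn_q' does not accidentally match a longer component.
    """
    for component in tensor_name.split("."):
        if component in OVERRIDES:
            return OVERRIDES[component]
    return BASE_TYPE


if __name__ == "__main__":
    for name in (
        "blk.0.attn_q.weight",
        "blk.0.attn_v.weight",
        "token_embed.weight",
        "blk.0.ffn_down.weight",
    ):
        print(f"{name}: {quant_type_for(name)}")
```

Any tensor not listed in the override table falls back to the IQ4_XS base type, which is what "basically IQ4_XS" means here.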

WARNING: Compatible only with IK_Llama.cpp and Croco.cpp (my fork of the great KoboldCpp). I'll release an .exe soon, but it already works (at least on Windows) for those who can compile it. https://github.com/Nexesenex/croco.cpp

Overall, maybe it's time for the Llama.cpp team to take a look at Ikawrakow's latest work and offer him terms of cooperation, so we can once again enjoy SOTA quants in Llama.cpp. https://github.com/ikawrakow/ik_llama.cpp

Because the situation is becoming grotesque: we are mass-quantizing models with non-SOTA quants while better ones are within reach. Thousands of terabytes of storage space, along with our compute and our time, are being wasted because of this situation.

Format: GGUF
Model size: 123B params
Architecture: llama
