Update README.md
README.md
CHANGED
@@ -1,12 +1,14 @@
 ---
 license: apache-2.0
 ---
-Update:
+Update: User (@concendo) asked whether these were from before or after the 4/3 update to llama.cpp; since I wasn't sure, everything was requantized with the 4/18 version of llama.cpp.
+
+Note: the qx-k-m quants are not as good as the qx-0 quants; something about the 'k' quantization process doesn't play nice with Mixtral.
 
 
 These are the quantized GGUF files for [Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1).
 
-They were converted from Mistral's safetensors and quantized on April
+They were converted from Mistral's safetensors and quantized on April 18, 2024.
 This matters because some of the GGUF files for Mixtral 8x7B were created as soon as llama.cpp supported MoE architecture, but there were still bugs at that time.
 Those bugs have since been patched.
 
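For anyone who just wants to sanity-check one of these files, here is a minimal sketch of loading a quant with llama-cpp-python. This is only one way to run GGUF files (the repo doesn't prescribe a runtime), and the filename below is a placeholder for whichever .gguf you actually download.

```python
from llama_cpp import Llama

# Placeholder path -- substitute the .gguf file you downloaded from this repo.
llm = Llama(
    model_path="mixtral-8x7b-instruct-v0.1.Q4_0.gguf",
    n_ctx=4096,        # context window
    n_gpu_layers=-1,   # offload all layers to GPU if one is available
)

# Mixtral-8x7B-Instruct uses the [INST] ... [/INST] prompt format.
out = llm("[INST] Explain what a GGUF file is in one sentence. [/INST]", max_tokens=128)
print(out["choices"][0]["text"])
```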