jamesburton committed · fadb488 · Parent(s): 9f77b1b
Added GGUF generation script and configuration, plus brief note
Files changed:
- README.md (+5 -1)
- imatrix/imatrix.txt (+0 -0)
README.md CHANGED
@@ -4,4 +4,8 @@ This is a GGUF version of https://huggingface.co/PhilipMay/Phi-3-mini-4k-instruc
 
 The source model is an 8x MoE version of microsoft/Phi-3-mini-4k-instruct. It is based on the Llamafied version vonjack/Phi-3-mini-4k-instruct-LLaMAfied of Gan Feng.
 
-It was created with the help of mergekit.
+It was created with the help of mergekit.
+
+I have included the gguf-imat.py script and imatrix\imatrix.txt configuration used for the conversion. This is based on FantasiaFoundry/GGUF-Quantization-Script, and tweaked to pad vocab to allow operation with this model.
+
+This model has been tested to be functional with LlamaSharp, so should be compatible with any llama.cpp based solutions.
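The commit note says the conversion script was tweaked to pad the vocabulary so the tokenizer's token count matches the model's embedding table, which llama.cpp conversion requires. A minimal sketch of that idea (a hypothetical `pad_vocab` helper, not the actual gguf-imat.py code):

```python
def pad_vocab(tokens, target_size, pad_format="[PAD{}]"):
    """Pad a token list with placeholder entries so its length matches
    the model's embedding rows; a size mismatch makes GGUF conversion fail."""
    if len(tokens) > target_size:
        raise ValueError(f"vocab ({len(tokens)}) exceeds target ({target_size})")
    padded = list(tokens)
    # Append dummy tokens until the tokenizer and embedding sizes agree.
    for i in range(len(tokens), target_size):
        padded.append(pad_format.format(i))
    return padded

vocab = ["<s>", "</s>", "hello", "world"]
print(len(pad_vocab(vocab, 8)))  # pads 4 real tokens out to 8
```

The placeholder name format is an assumption for illustration; the real script may use whatever naming the source tokenizer's convention dictates.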
imatrix/imatrix.txt ADDED
(The diff for this file is too large to render; see the raw file.)