InferenceIllusionist committed
Commit 80a45d6 • Parent(s): ce4e2d1
Updated chunks size
README.md CHANGED

```diff
@@ -25,7 +25,7 @@ tags:
 # dolphin-2.9.1-mixtral-1x22b-iMat-GGUF
 
 Quantized from fp16.
-* Weighted quantizations were created using fp16 GGUF and [groups_merged-enhancedV2-TurboMini.txt](https://github.com/ggerganov/llama.cpp/discussions/5263#discussioncomment-9432658) in
+* Weighted quantizations were created using fp16 GGUF and [groups_merged-enhancedV2-TurboMini.txt](https://github.com/ggerganov/llama.cpp/discussions/5263#discussioncomment-9432658) in 228 chunks and n_ctx=512
 * This method of calculating the importance matrix showed improvements in some areas for Mistral 7b and Llama3 8b models; see the post above for details
 * The enhancedv2-turbomini file appends snippets from turboderp's calibration data to the standard groups_merged.txt file
 * Repetition penalty 1.05-1.18 has worked well for these quants.
```
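For context on the first bullet of the diff: an importance matrix like the one described is produced by llama.cpp's imatrix tool, which streams a calibration file through the fp16 model in fixed-size chunks. The sketch below is a hedged reconstruction, not the author's recorded command: the model and output file names are hypothetical, and the 228-chunk figure would normally fall out of the calibration file's length at n_ctx=512 rather than being set by hand.

```sh
# Hedged sketch of the imatrix step (hypothetical file names). The binary was
# ./imatrix in llama.cpp builds of this era; newer builds call it llama-imatrix.
./llama-imatrix \
  -m dolphin-2.9.1-mixtral-1x22b-f16.gguf \
  -f groups_merged-enhancedV2-TurboMini.txt \
  -c 512 \
  -o dolphin-2.9.1-mixtral-1x22b.imatrix
```

The resulting .imatrix file is then supplied to llama-quantize via its --imatrix flag when producing the weighted quants, e.g. `./llama-quantize --imatrix dolphin-2.9.1-mixtral-1x22b.imatrix input-f16.gguf output-IQ4_XS.gguf IQ4_XS`.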
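As a usage illustration for the repetition-penalty note, the run below applies a value from the suggested 1.05-1.18 range using llama.cpp's standard sampler flag; the quant file name, the prompt, and the 1.1 value are placeholders, not recommendations from the model card itself.

```sh
# Illustrative only: load one of these quants with a repetition penalty of 1.1,
# i.e. any value in the 1.05-1.18 range suggested above (file name hypothetical;
# the binary was ./main in older llama.cpp builds, ./llama-cli in newer ones).
./llama-cli \
  -m dolphin-2.9.1-mixtral-1x22b-IQ4_XS.gguf \
  --repeat-penalty 1.1 \
  -p "Explain what an importance matrix does during quantization."
```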