InferenceIllusionist committed
Commit 80a45d6 • Parent(s): ce4e2d1
Updated chunks size
README.md CHANGED

```diff
@@ -25,7 +25,7 @@ tags:
 # dolphin-2.9.1-mixtral-1x22b-iMat-GGUF
 
 Quantized from fp16.
-* Weighted quantizations were created using fp16 GGUF and [groups_merged-enhancedV2-TurboMini.txt](https://github.com/ggerganov/llama.cpp/discussions/5263#discussioncomment-9432658) in
+* Weighted quantizations were created using fp16 GGUF and [groups_merged-enhancedV2-TurboMini.txt](https://github.com/ggerganov/llama.cpp/discussions/5263#discussioncomment-9432658) in 228 chunks and n_ctx=512
 * This method of calculating the importance matrix showed improvements in some areas for Mistral 7b and Llama3 8b models; see the post above for details
 * The enhancedv2-turbomini file appends snippets from turboderp's calibration data to the standard groups_merged.txt file
 * Repetition penalty 1.05-1.18 has worked well for these quants.
```
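For context on the first bullet of the diff: an importance matrix like the one described is produced by llama.cpp's imatrix tool, which streams a calibration file through the fp16 model in fixed-size chunks. The sketch below is a hedged reconstruction, not the author's recorded command: the model and output file names are hypothetical, and the 228-chunk figure would normally fall out of the calibration file's length at n_ctx=512 rather than being set by hand.

```sh
# Hedged sketch of the imatrix step (hypothetical file names). The binary was
# ./imatrix in llama.cpp builds of this era; newer builds call it llama-imatrix.
./llama-imatrix \
  -m dolphin-2.9.1-mixtral-1x22b-f16.gguf \
  -f groups_merged-enhancedV2-TurboMini.txt \
  -c 512 \
  -o dolphin-2.9.1-mixtral-1x22b.imatrix
```

The resulting .imatrix file is then supplied to llama-quantize via its --imatrix flag when producing the weighted quants, e.g. `./llama-quantize --imatrix dolphin-2.9.1-mixtral-1x22b.imatrix input-f16.gguf output-IQ4_XS.gguf IQ4_XS`.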
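As a usage illustration for the repetition-penalty note, the run below applies a value from the suggested 1.05-1.18 range using llama.cpp's standard sampler flag; the quant file name, the prompt, and the 1.1 value are placeholders, not recommendations from the model card itself.

```sh
# Illustrative only: load one of these quants with a repetition penalty of 1.1,
# i.e. any value in the 1.05-1.18 range suggested above (file name hypothetical;
# the binary was ./main in older llama.cpp builds, ./llama-cli in newer ones).
./llama-cli \
  -m dolphin-2.9.1-mixtral-1x22b-IQ4_XS.gguf \
  --repeat-penalty 1.1 \
  -p "Explain what an importance matrix does during quantization."
```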