TheBloke committed
Commit cb5eec6
1 Parent(s): 55bdfaa

Upload README.md

Files changed (1): README.md +6 -1
README.md CHANGED
@@ -122,7 +122,10 @@ Refer to the Provided Files table below to see what files use which methods, and
 
 | Name | Quant method | Bits | Size | Max RAM required | Use case |
 | ---- | ---- | ---- | ---- | ---- | ----- |
-| [yi-34b-200k-dare-megamerge-v8.Q2_K.gguf](https://huggingface.co/TheBloke/Yi-34B-200K-DARE-megamerge-v8-GGUF/blob/main/yi-34b-200k-dare-megamerge-v8.Q2_K.gguf) | Q2_K | 2 | 12.77 GB | 15.27 GB | smallest, significant quality loss - not recommended for most purposes |
+| [yi-34b-200k-dare-megamerge-v8.IQ2_XXS.gguf](https://huggingface.co/TheBloke/Yi-34B-200K-DARE-megamerge-v8-GGUF/blob/main/yi-34b-200k-dare-megamerge-v8.IQ2_XXS.gguf) | IQ2_XXS | 2 | 9.31 GB | 11.81 GB | smallest size. 2.06 bpw. New IQuant method, Jan 2024 |
+| [yi-34b-200k-dare-megamerge-v8.IQ2_XS.gguf](https://huggingface.co/TheBloke/Yi-34B-200K-DARE-megamerge-v8-GGUF/blob/main/yi-34b-200k-dare-megamerge-v8.IQ2_XS.gguf) | IQ2_XS | 2 | 10.31 GB | 12.81 GB | second smallest size. 2.31 bpw quant. New IQuant method, Jan 2024 |
+| [yi-34b-200k-dare-megamerge-v8.Q2_K_S.gguf](https://huggingface.co/TheBloke/Yi-34B-200K-DARE-megamerge-v8-GGUF/blob/main/yi-34b-200k-dare-megamerge-v8.Q2_K_S.gguf) | Q2_K_S | 2 | 11.76 GB | 14.26 GB | significant quality loss - not recommended for most purposes. New method, Jan 2024 |
+| [yi-34b-200k-dare-megamerge-v8.Q2_K.gguf](https://huggingface.co/TheBloke/Yi-34B-200K-DARE-megamerge-v8-GGUF/blob/main/yi-34b-200k-dare-megamerge-v8.Q2_K.gguf) | Q2_K | 2 | 12.77 GB | 15.27 GB | significant quality loss - not recommended for most purposes |
 | [yi-34b-200k-dare-megamerge-v8.Q3_K_S.gguf](https://huggingface.co/TheBloke/Yi-34B-200K-DARE-megamerge-v8-GGUF/blob/main/yi-34b-200k-dare-megamerge-v8.Q3_K_S.gguf) | Q3_K_S | 3 | 14.96 GB | 17.46 GB | very small, high quality loss |
 | [yi-34b-200k-dare-megamerge-v8.Q3_K_M.gguf](https://huggingface.co/TheBloke/Yi-34B-200K-DARE-megamerge-v8-GGUF/blob/main/yi-34b-200k-dare-megamerge-v8.Q3_K_M.gguf) | Q3_K_M | 3 | 16.65 GB | 19.15 GB | very small, high quality loss |
 | [yi-34b-200k-dare-megamerge-v8.Q3_K_L.gguf](https://huggingface.co/TheBloke/Yi-34B-200K-DARE-megamerge-v8-GGUF/blob/main/yi-34b-200k-dare-megamerge-v8.Q3_K_L.gguf) | Q3_K_L | 3 | 18.14 GB | 20.64 GB | small, substantial quality loss |
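
For reference, fetching and running one of the files in the table above can be scripted; a minimal sketch, assuming `huggingface_hub` and `llama-cpp-python` are installed (the snippet and the 4096-token `n_ctx` are illustrative, not taken from this README):

```python
# Minimal sketch: fetch one quant from the table above and load it locally.
# Assumptions (not from this README): huggingface_hub and llama-cpp-python
# are installed, and a 4096-token context is enough for the use case.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

gguf_path = hf_hub_download(
    repo_id="TheBloke/Yi-34B-200K-DARE-megamerge-v8-GGUF",
    filename="yi-34b-200k-dare-megamerge-v8.Q2_K.gguf",  # any filename from the table
)

# n_ctx is deliberately low; RAM use grows with context length.
llm = Llama(model_path=gguf_path, n_ctx=4096)
print(llm("Hello", max_tokens=16)["choices"][0]["text"])
```

Any `filename` from the table works unchanged; the smaller IQ2/Q2 quants trade quality for the lower RAM figures listed above.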
@@ -353,6 +356,8 @@ Being a Yi model, run a lower temperature with 0.05 or higher MinP, a little rep
 
 I recommend exl2 quantizations profiled on data similar to the desired task. It is especially sensitive to the quantization data at low bpw. I've uploaded my own fiction-oriented quantizations here: https://huggingface.co/collections/brucethemoose/most-recent-merge-65742644ca03b6c514afa204
 
+LoneStriker has also uploaded more general-purpose quantizations here: https://huggingface.co/models?sort=trending&search=LoneStriker+Yi-34B-200K-DARE-megamerge-v8
+
 To load/train this in full-context backends like transformers, you *must* change `max_position_embeddings` in config.json to a value lower than 200,000, otherwise you will OOM! I do not recommend running high context without context-efficient backends like exllamav2, litellm or unsloth.
 
 
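For the `max_position_embeddings` note above, a minimal sketch of applying the override in code rather than hand-editing config.json; the repo id and the 32,768 cap are illustrative assumptions, not values from this README:

```python
# Minimal sketch of the config change described above. The repo id and the
# 32768 cap are assumptions for illustration, not values from this README.
from transformers import AutoConfig, AutoModelForCausalLM

model_id = "brucethemoose/Yi-34B-200K-DARE-megamerge-v8"  # assumed fp16 source repo

config = AutoConfig.from_pretrained(model_id)
config.max_position_embeddings = 32768  # must be lower than 200,000, per the note above

model = AutoModelForCausalLM.from_pretrained(model_id, config=config)
```

Either route achieves the same thing as editing config.json by hand: the lower value just has to be in place before the model is instantiated.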