brucethemoose committed
Commit ffdfe3b
1 Parent(s): 034e3bb
Update README.md
README.md CHANGED
@@ -16,8 +16,11 @@ https://github.com/yule-BUAA/MergeLM
 
 https://github.com/cg123/mergekit/tree/dare'
 
-24GB GPUs can run Yi-34B-200K models at **45K-75K context** with exllamav2. I go into more detail in this [Reddit post](https://old.reddit.com/r/LocalLLaMA/comments/1896igc/how_i_run_34b_models_at_75k_context_on_24gb_fast/), and recommend exl2 quantizations on data similar to the desired task
+24GB GPUs can run Yi-34B-200K models at **45K-75K context** with exllamav2. I go into more detail in this [Reddit post](https://old.reddit.com/r/LocalLLaMA/comments/1896igc/how_i_run_34b_models_at_75k_context_on_24gb_fast/), and recommend exl2 quantizations on data similar to the desired task, such as these targeted at fiction:
 
+[4.0bpw](https://huggingface.co/brucethemoose/CapyTessBorosYi-34B-200K-DARE-Ties-exl2-4bpw-fiction)
+
+[3.1bpw](https://huggingface.co/brucethemoose/CapyTessBorosYi-34B-200K-DARE-Ties-exl2-3.1bpw-fiction)
 ***
 
 Merged with the following config, and the tokenizer from chargoddard's Yi-Llama:
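As a usage note on the paragraph changed above: a minimal sketch of loading one of these exl2 quants at reduced context with exllamav2's Python API. The local model directory, the 45K context figure, the 8-bit cache choice, and the sampler settings are illustrative assumptions, not something this commit specifies.

```python
# Sketch: load an exl2 quant with exllamav2 and generate at a context
# length trimmed to fit a 24GB card. Paths and numbers are assumptions.
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache_8bit, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config()
config.model_dir = "CapyTessBorosYi-34B-200K-DARE-Ties-exl2-4bpw-fiction"  # local snapshot (assumed path)
config.prepare()
config.max_seq_len = 45000  # trim the 200K window so weights + KV cache fit in 24GB

model = ExLlamaV2(config)
cache = ExLlamaV2Cache_8bit(model, lazy=True)  # 8-bit KV cache roughly halves cache VRAM
model.load_autosplit(cache)                    # fill available GPU memory automatically

tokenizer = ExLlamaV2Tokenizer(config)
generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)

settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.8  # arbitrary example value

print(generator.generate_simple("Once upon a time,", settings, num_tokens=200))
```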