brucethemoose committed
Commit 034e3bb
1 Parent(s): f115204
Update README.md
README.md CHANGED
@@ -16,7 +16,7 @@ https://github.com/yule-BUAA/MergeLM
 
 https://github.com/cg123/mergekit/tree/dare'
 
-24GB GPUs can run
+24GB GPUs can run Yi-34B-200K models at **45K-75K context** with exllamav2. I go into more detail in this [Reddit post](https://old.reddit.com/r/LocalLLaMA/comments/1896igc/how_i_run_34b_models_at_75k_context_on_24gb_fast/), and recommend exl2 quantizations on data similar to the desired task.
 
 ***
 
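The added README line says a 24GB GPU can run Yi-34B-200K models at 45K-75K context with exllamav2. As a minimal sketch of what that looks like in practice (not part of this commit), the Python below loads a hypothetical local exl2 quant at ~45K context; the model path, the exact context figure, and the sampler settings are illustrative assumptions, and exllamav2's API may differ between versions.

```python
# Minimal sketch: load an exl2 quant of a Yi-34B-200K model at long context
# with exllamav2. The model path and 45K figure are assumptions, not from
# the commit; the 8-bit KV cache is one exllamav2 option for saving VRAM.
from exllamav2 import (
    ExLlamaV2,
    ExLlamaV2Cache_8bit,
    ExLlamaV2Config,
    ExLlamaV2Tokenizer,
)
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config()
config.model_dir = "/models/Yi-34B-200K-exl2"  # hypothetical path to an exl2 quant
config.prepare()
config.max_seq_len = 45056  # ~45K tokens; push toward 75K as VRAM allows

model = ExLlamaV2(config)
cache = ExLlamaV2Cache_8bit(model, lazy=True)  # 8-bit KV cache trims memory use
model.load_autosplit(cache)  # fill available GPU memory as weights load

tokenizer = ExLlamaV2Tokenizer(config)
generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)

settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.8

print(generator.generate_simple("Summarize this document:", settings, num_tokens=200))
```

The README's advice to quantize "on data similar to the desired task" applies at quant-creation time rather than load time: exllamav2's conversion script accepts a calibration dataset, so an exl2 quant can be measured against text resembling the intended workload.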