brucethemoose committed on
Commit 2f3eb91 (1 parent: 681df06)

Update README.md

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -6,6 +6,8 @@ language:
  - en
  library_name: transformers
  pipeline_tag: text-generation
+ tags:
+ - text-generation-inference
  ---

  **NousResearch/Nous-Capybara-34B**, **migtissera/Tess-M-v1.3** and **bhenrym14/airoboros-3_1-yi-34b-200k** merged with a new, experimental implementation of "dare ties" via mergekit. See:
@@ -18,7 +20,7 @@ https://github.com/cg123/mergekit/tree/dare'
  ***


- 24GB GPUs can run Yi-34B-200K models at **45K-75K context** with exllamav2. I go into more detail in this [Reddit post](https://old.reddit.com/r/LocalLLaMA/comments/1896igc/how_i_run_34b_models_at_75k_context_on_24gb_fast/), and recommend exl2 quantizations on data similar to the desired task, such as these targeted at fiction: [4.0bpw](https://huggingface.co/brucethemoose/CapyTessBorosYi-34B-200K-DARE-Ties-exl2-4bpw-fiction) [3.1bpw](https://huggingface.co/brucethemoose/CapyTessBorosYi-34B-200K-DARE-Ties-exl2-3.1bpw-fiction)
+ 24GB GPUs can run Yi-34B-200K models at **45K-75K context** with exllamav2. I go into more detail in this [post](https://old.reddit.com/r/LocalLLaMA/comments/1896igc/how_i_run_34b_models_at_75k_context_on_24gb_fast/), and recommend exl2 quantizations on data similar to the desired task, such as these targeted at story writing: [4.0bpw](https://huggingface.co/brucethemoose/CapyTessBorosYi-34B-200K-DARE-Ties-exl2-4bpw-fiction) / [3.1bpw](https://huggingface.co/brucethemoose/CapyTessBorosYi-34B-200K-DARE-Ties-exl2-3.1bpw-fiction)
  ***

  Merged with the following config, and the tokenizer from chargoddard's Yi-Llama: