bhenrym14 committed on
Commit
468225a
1 Parent(s): 794538c

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -22,7 +22,7 @@ Pretraining took 10 hours. Finetuning took ~41 hours on 1x RTX 6000 Ada.
 
  ## How to Use
 
- The easiest way is to use the GPTQ weights (linked above) with [oobabooga text-generation-webui](https://github.com/oobabooga/text-generation-webui) and ExLlama. You'll need to set max_seq_len to 8192 and compress_pos_emb to 4. Otherwise use the transformers module.
+ The easiest way is to use the GPTQ weights (linked above) with [oobabooga text-generation-webui](https://github.com/oobabooga/text-generation-webui) and ExLlama. You'll need to set max_seq_len to 16384 and compress_pos_emb to 8. Otherwise use the transformers module.
 
  **IMPORTANT: To use these weights you'll need to patch in the appropriate RoPE scaling module. see: [replace_llama_rope_with_scaled_rope](https://github.com/bhenrym14/qlora-airoboros-longcontext/blob/main/scaledllama/llama_rope_scaled_monkey_patch-16k.py)**
 
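
Note on the changed values: both the old pair (8192, 4) and the new pair (16384, 8) are consistent with linear RoPE position compression if the base model's native context is 2048 tokens. That native length is an assumption here (it is the LLaMA-1 default and is not stated in this diff), and the helper names below are hypothetical, not taken from ExLlama or from the linked monkey patch. A minimal sketch of the arithmetic:

```python
# Assumption: the base model was trained with a 2048-token context window.
NATIVE_CONTEXT = 2048


def compress_pos_emb_for(max_seq_len: int, native_context: int = NATIVE_CONTEXT) -> int:
    """Return the linear compression factor that maps max_seq_len back into native_context."""
    if max_seq_len <= 0 or max_seq_len % native_context != 0:
        raise ValueError("max_seq_len must be a positive multiple of native_context")
    return max_seq_len // native_context


def scaled_positions(seq_len: int, factor: int) -> list[float]:
    """Linear interpolation: every position index is divided by the compression factor."""
    return [i / factor for i in range(seq_len)]


if __name__ == "__main__":
    for target in (8192, 16384):
        factor = compress_pos_emb_for(target)
        largest = max(scaled_positions(target, factor))
        print(f"max_seq_len={target} compress_pos_emb={factor} largest scaled position={largest}")
```

Under that assumption, the scaled positions always stay below 2048, which is why max_seq_len and compress_pos_emb have to be changed together, as this commit does.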