intervitens committed
Commit b6e73d3
1 Parent(s): 813b3b0

Update README.md

Files changed (1): README.md (+10 -0)
README.md CHANGED
@@ -11,6 +11,16 @@ datasets:
 
- lemonilia/LimaRP
---

+ Quantized using 200 samples of 8192 tokens each from the RP-oriented [PIPPA](https://huggingface.co/datasets/royallab/PIPPA-cleaned) dataset.
+
+ Requires ExllamaV2 version 0.0.11 or later.
+
+ Original model link: [Doctor-Shotgun/Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss](https://huggingface.co/Doctor-Shotgun/Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss)
+
+ Original model README below.
+
+ ***
+
# Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss

Experimental model, using a LimaRP QLoRA trained at 10k context length (greater than the length of the longest LimaRP sample when tokenized with Mistral's tokenizer) on [mistralai/Mixtral-8x7B-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1), using [Charles Goddard](https://huggingface.co/chargoddard)'s ZLoss- and Megablocks-based fork of transformers, and then fused into [mistralai/Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1) at 0.5 weight.
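
For context on the quantization line added above: "200 samples of 8192 tokens" maps to the calibration options of ExllamaV2's converter. Below is a rough sketch of preparing such a calibration file, assuming the converter accepts a parquet file with a single `text` column; the dataset field name and the `convert.py` flags shown in the comments are assumptions from memory, not taken from this commit.

```python
# Sketch: prepare ~200 calibration rows from PIPPA for EXL2 quantization.
# Assumes exllamav2's convert.py accepts a parquet file of raw text via
# its -c flag; verify against the version you have installed.
from datasets import load_dataset
import pandas as pd

ds = load_dataset("royallab/PIPPA-cleaned", split="train")

rows = []
for sample in ds:
    # "conversation" is an assumed field name in PIPPA-cleaned;
    # adjust to the actual schema.
    conv = sample.get("conversation", sample)
    rows.append({"text": str(conv)})
    if len(rows) >= 200:
        break

# The converter, not this script, handles truncating each row to the
# requested token length (8192 here).
pd.DataFrame(rows).to_parquet("pippa_calibration.parquet")

# Hypothetical conversion command (flag names recalled from exllamav2
# docs of that era; double-check before running):
#   python convert.py -i <fp16_model_dir> -o <work_dir> -cf <out_dir> \
#       -c pippa_calibration.parquet -l 8192 -r 200 -b <bits_per_weight>
```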
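Since the card pins ExllamaV2 0.0.11 as the minimum version, a loading-and-generation sketch against that era's Python API may be useful; the class and method names below are recalled from exllamav2's own examples and should be checked against your installed version.

```python
# Minimal generation sketch with exllamav2 (~0.0.11-era API).
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config()
config.model_dir = "Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss-exl2"  # local quant dir
config.prepare()

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)  # allocate cache as layers load
model.load_autosplit(cache)               # split across available GPUs

tokenizer = ExLlamaV2Tokenizer(config)
generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)

settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.8
settings.top_p = 0.9

print(generator.generate_simple("[INST] Write a short scene. [/INST]",
                                settings, 200))
```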
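The original README's "fused ... at 0.5 weight" describes merging the LoRA delta into the Instruct weights at half strength. A minimal PyTorch sketch of that idea, with illustrative tensor names rather than the actual tooling used for this model:

```python
import torch

def fuse_lora(base_weight: torch.Tensor,
              lora_A: torch.Tensor,      # shape (r, in_features)
              lora_B: torch.Tensor,      # shape (out_features, r)
              lora_alpha: float,
              merge_weight: float = 0.5) -> torch.Tensor:
    """Return base_weight + merge_weight * (alpha / r) * (B @ A)."""
    r = lora_A.shape[0]
    scaling = lora_alpha / r
    delta = (lora_B @ lora_A) * scaling
    return base_weight + merge_weight * delta

# Toy usage with random tensors:
W = torch.randn(32, 64)
A = torch.randn(8, 64)
B = torch.randn(32, 8)
W_merged = fuse_lora(W, A, B, lora_alpha=16.0, merge_weight=0.5)
```

At `merge_weight=1.0` this is an ordinary LoRA merge; 0.5 halves the adapter's influence, which matches the card's description of blending the LimaRP tune with the Instruct model's behavior.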