Fizzarolli commited on
Commit
0c4f3fe
·
verified ·
1 Parent(s): 0ee9ea1

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +38 -3
README.md CHANGED
@@ -1,3 +1,38 @@
1
- ---
2
- license: apache-2.0
3
- ---
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ---
2
+ license: apache-2.0
3
+ base_model:
4
+ - allura-org/Teleut-7b
5
+ tags:
6
+ - roleplay
7
+ - conversational
8
+ ---
9
+ # Teleut 7b RP
10
+ [cute boygirlthing pending]
11
+
12
+ A roleplay-focused LoRA finetune of Teleut 7b. Methodology and hyperparams inspired by [SorcererLM](https://huggingface.co/rAIfle/SorcererLM-8x22b-bf16).
13
+
14
+ ## Dataset
15
+ The worst mix of data you've ever seen. Like, seriously, you do not want to see the things that went into this model. It's bad.
16
+
17
+ ## Recommended Settings
18
+ Chat template: ChatML
19
+ Recommended samplers (not the be-all-end-all, try some on your own!):
20
+ - Temp 1.03 / TopK 200 / MinP 0.05 / TopA 0.2
21
+ - Temp 1.03 / TFS 0.75 / TopA 0.3
22
+
23
+ ## Hyperparams
24
+ General:
25
+ - Epochs = 2
26
+ - LR = 6e-5
27
+ - LR Scheduler = Cosine
28
+ - Optimizer = Paged AdamW 8bit
29
+ - Effective batch size = 12
30
+ LoRA:
31
+ - Rank = 16
32
+ - Alpha = 32
33
+ - Dropout = 0.25 (Inspiration: [Slush](https://huggingface.co/crestf411/Q2.5-32B-Slush))
34
+
35
+ ## Credits
36
+ Thanks to the people who created the data. I would credit you, but that would be cheating ;)
37
+ Thanks to all Allura members, especially Toasty, for testing and emotional support ilya /platonic
38
+ NO thanks to Infermatic. They suck at hosting models