Update README.md
README.md
CHANGED
@@ -13,8 +13,6 @@ model-index:
 
 ![Kunou](https://huggingface.co/Sao10K/72B-Qwen2.5-Kunou-v1/resolve/main/knn.png)
 
-**Sister Versions for Lightweight and Heavyweight Use!**
-
 # 14B-Qwen2.5-Freya-v1
 
 I decided to mess around with training methods, considering the re-emegence of no longer used methods like multi-step training. Some people began doing it again, and so, why not? Inspired by LimaRP's methology but done it my way.
@@ -31,8 +29,8 @@ Freya-S2
 Recommended Model Settings | *Look, I just use these, they work fine enough. I don't even know how DRY or other meme samplers work. Your system prompt matters more anyway.*
 ```
 Prompt Format: ChatML
-Temperature: 1.
-min_p: 0.
+Temperature: 1+ # I don't know, man.
+min_p: 0.05
 ```
 
 Training time in total was ~10 Hours on a 8xH100 Node, sponsored by the Government of Singapore or something. Thanks for the national service allowance, MHA.
@@ -118,7 +116,7 @@ liger_fused_linear_cross_entropy: true
 
 # Iterations
 num_epochs:
-- s1:
+- s1: 1
 - s2: 2
 
 # Sampling
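For readers unsure what the `min_p: 0.05` setting in the recommended settings does, here is a minimal sketch of min-p filtering as it is commonly described: tokens whose probability falls below `min_p` times the top token's probability are dropped, and the rest are renormalised. The function name `min_p_filter` and the toy distribution are illustrative assumptions, not taken from the README or from any particular inference backend, and real backends may differ in details such as where temperature is applied.

```python
def min_p_filter(probs, min_p=0.05):
    """Keep tokens with probability >= min_p * (top probability), then renormalise.

    `probs` maps token -> probability. This is an illustrative sketch,
    not the implementation used by any specific sampler library.
    """
    threshold = min_p * max(probs.values())
    kept = {tok: p for tok, p in probs.items() if p >= threshold}
    total = sum(kept.values())
    return {tok: p / total for tok, p in kept.items()}


# Toy next-token distribution (made-up numbers, for illustration only).
probs = {"the": 0.60, "a": 0.25, "an": 0.12, "zebra": 0.02, "qux": 0.01}

# With min_p=0.05 the cutoff is 0.05 * 0.60 = 0.03, so the two
# lowest-probability tokens fall below it and are removed.
filtered = min_p_filter(probs, min_p=0.05)
```

Because the cutoff scales with the top token's probability, the filter prunes more aggressively when the model is confident and less when the distribution is flat.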