Sao10K committed on
Commit 82fc52e · verified · Parent(s): 854f7bc

Update README.md

Files changed (1): README.md +3 −5
README.md CHANGED
@@ -13,8 +13,6 @@ model-index:
 
 ![Kunou](https://huggingface.co/Sao10K/72B-Qwen2.5-Kunou-v1/resolve/main/knn.png)
 
-**Sister Versions for Lightweight and Heavyweight Use!**
-
 # 14B-Qwen2.5-Freya-v1
 
 I decided to mess around with training methods, considering the re-emegence of no longer used methods like multi-step training. Some people began doing it again, and so, why not? Inspired by LimaRP's methology but done it my way.
@@ -31,8 +29,8 @@ Freya-S2
 Recommended Model Settings | *Look, I just use these, they work fine enough. I don't even know how DRY or other meme samplers work. Your system prompt matters more anyway.*
 ```
 Prompt Format: ChatML
-Temperature: 1.1
-min_p: 0.1
+Temperature: 1+ # I don't know, man.
+min_p: 0.05
 ```
 
 Training time in total was ~10 Hours on a 8xH100 Node, sponsored by the Government of Singapore or something. Thanks for the national service allowance, MHA.
@@ -118,7 +116,7 @@ liger_fused_linear_cross_entropy: true
 
 # Iterations
 num_epochs:
-- s1: 2
+- s1: 1
 - s2: 2
 
 # Sampling
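Context for the `min_p` change above (not part of the commit, just an illustrative sketch): min_p sampling keeps only tokens whose probability is at least `min_p` times the top token's probability, then renormalizes — so lowering it from 0.1 to 0.05 admits more of the tail. A minimal standalone version of the filter, with hypothetical example logits:

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities at the given temperature."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def min_p_filter(probs, min_p=0.05):
    """Drop tokens below min_p * max(probs), renormalize the rest."""
    cutoff = min_p * max(probs)
    kept = {i: p for i, p in enumerate(probs) if p >= cutoff}
    total = sum(kept.values())
    return {i: p / total for i, p in kept.items()}


# Hypothetical 4-token vocabulary; the weakest token falls under the cutoff.
probs = softmax([2.0, 1.0, 0.2, -3.0], temperature=1.0)
filtered = min_p_filter(probs, min_p=0.05)
```

Because the cutoff scales with the top token's probability, the filter adapts per step: confident distributions prune aggressively, flat ones keep more candidates.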