Sao10K committed on
Commit 82fc52e · verified · Parent(s): 854f7bc

Update README.md

Files changed (1): README.md +3 −5
README.md CHANGED
@@ -13,8 +13,6 @@ model-index:
 
 ![Kunou](https://huggingface.co/Sao10K/72B-Qwen2.5-Kunou-v1/resolve/main/knn.png)
 
-**Sister Versions for Lightweight and Heavyweight Use!**
-
 # 14B-Qwen2.5-Freya-v1
 
 I decided to mess around with training methods, considering the re-emegence of no longer used methods like multi-step training. Some people began doing it again, and so, why not? Inspired by LimaRP's methology but done it my way.
@@ -31,8 +29,8 @@ Freya-S2
 Recommended Model Settings | *Look, I just use these, they work fine enough. I don't even know how DRY or other meme samplers work. Your system prompt matters more anyway.*
 ```
 Prompt Format: ChatML
-Temperature: 1.1
-min_p: 0.1
+Temperature: 1+ # I don't know, man.
+min_p: 0.05
 ```
 
 Training time in total was ~10 Hours on a 8xH100 Node, sponsored by the Government of Singapore or something. Thanks for the national service allowance, MHA.
@@ -118,7 +116,7 @@ liger_fused_linear_cross_entropy: true
 
 # Iterations
 num_epochs:
-- s1: 2
+- s1: 1
 - s2: 2
 
 # Sampling
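Context for the `min_p` change above (not part of the commit, just an illustrative sketch): min_p sampling keeps only tokens whose probability is at least `min_p` times the top token's probability, then renormalizes — so lowering it from 0.1 to 0.05 admits more of the tail. A minimal standalone version of the filter, with hypothetical example logits:

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities at the given temperature."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def min_p_filter(probs, min_p=0.05):
    """Drop tokens below min_p * max(probs), renormalize the rest."""
    cutoff = min_p * max(probs)
    kept = {i: p for i, p in enumerate(probs) if p >= cutoff}
    total = sum(kept.values())
    return {i: p / total for i, p in kept.items()}


# Hypothetical 4-token vocabulary; the weakest token falls under the cutoff.
probs = softmax([2.0, 1.0, 0.2, -3.0], temperature=1.0)
filtered = min_p_filter(probs, min_p=0.05)
```

Because the cutoff scales with the top token's probability, the filter adapts per step: confident distributions prune aggressively, flat ones keep more candidates.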