---
library_name: transformers
license: llama3
base_model: meta-llama/Llama-3.3-70B-Instruct
tags:
- generated_from_trainer
model-index:
- name: L3.3-70B-Euryale-v2.3
  results: []
---
# L3.3-70B-Euryale-v2.3

A direct replacement for / successor to Euryale v2.2 (not Hanami-x1), though in my opinion it is slightly better.

Recommended Model Settings | *Look, I just use these and they work fine enough. I don't even know how DRY or the other meme samplers work. Your system prompt matters more anyway.*
```
Prompt Format: Llama-3-Instruct
Temperature: 1.1
min_p: 0.1
```
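
To make those numbers concrete, here is a minimal sketch of plugging them into a plain `transformers` generation call. The repo id, dtype, and prompt contents are assumptions on my part, and `min_p` needs a reasonably recent `transformers` release; any backend that exposes temperature and min_p samplers works the same way.

```python
# Minimal sketch (not from the original card): Llama-3-Instruct prompt format
# via apply_chat_template, plus the recommended temperature / min_p values.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Sao10K/L3.3-70B-Euryale-v2.3"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "system", "content": "You are Euryale, a creative roleplay narrator."},  # illustrative only
    {"role": "user", "content": "Open the scene in a rain-soaked neon city."},
]

# apply_chat_template handles the Llama-3-Instruct formatting for you.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(
    input_ids,
    max_new_tokens=512,
    do_sample=True,
    temperature=1.1,  # recommended above
    min_p=0.1,        # recommended above
)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```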
Future-ish plans:
<br>\- Complete this model series.
<br>\- Further refine the datasets used: higher quality, more secondary chats, more creative-related domains. (Inspired by Drummer)
<br>\- Work on my other incomplete projects. About half a dozen have been on the backburner for a while now.

Special thanks to my wallet for funding this, to my juniors who share a single braincell between them, and to my current national service.
<br>Have a good day, and don't shit yourselves, friends. I had a nasty call today.

Also, sorry for the inactivity. Life got in the way. It still is, just less so for now. Burnout is a thing, huh?

https://sao10k.carrd.co/ for contact.

---
[<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
<details><summary>See axolotl config</summary>

axolotl version: `0.5.2`
```yaml
base_model: meta-llama/Llama-3.3-70B-Instruct
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer

load_in_8bit: false
load_in_4bit: false
strict: false
sequence_len: 16384
bf16: auto
fp16:
tf32: false
flash_attention: true

adapter: lora
lora_model_dir:
lora_r: 128
lora_alpha: 16
lora_dropout: 0.1
lora_target_linear: true
lora_fan_in_fan_out:
peft_use_rslora: true

# Data
dataset_prepared_path: last_run_prepared
datasets:
  - path: datasets/amoral-full-sys-prompt.json # Unalignment Data - Cleaned Up from Original, Split to its own file
    type: customllama3
  - path: datasets/mimi-superfix-RP-filtered-fixed.json # RP / Creative-Instruct Data
    type: customllama3
  - path: datasets/hespera-smartshuffle.json # Hesperus-v2-Instruct Data
    type: customllama3
warmup_steps: 15

plugins:
  - axolotl.integrations.liger.LigerPlugin
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_fused_linear_cross_entropy: true

# Iterations
num_epochs: 1

# Batching
gradient_accumulation_steps: 4
micro_batch_size: 1
gradient_checkpointing: "unsloth"

# Optimizer
optimizer: paged_ademamix_8bit
lr_scheduler: cosine
learning_rate: 0.000004
weight_decay: 0.1
max_grad_norm: 25.0
# Misc
deepspeed: ./deepspeed_configs/zero3_bf16.json
```

</details><br>
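
One side note from me on the adapter hyperparameters above (not from the original card): with `peft_use_rslora: true`, PEFT scales the LoRA update by `lora_alpha / sqrt(lora_r)` rather than the classic `lora_alpha / lora_r`, so a high rank of 128 paired with a small alpha of 16 still gives an effective scaling above 1 instead of 0.125. A quick back-of-the-envelope comparison:

```python
import math

# Classic LoRA scaling vs. rank-stabilized LoRA (rsLoRA) scaling
# for the values used in the config above (lora_r=128, lora_alpha=16).
r, alpha = 128, 16

classic_scaling = alpha / r            # 16 / 128       = 0.125
rslora_scaling = alpha / math.sqrt(r)  # 16 / sqrt(128) ~= 1.414

print(f"classic LoRA scaling: {classic_scaling:.3f}")
print(f"rsLoRA scaling:       {rslora_scaling:.3f}")
```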