Epiculous committed on
Commit 878cf39
1 Parent(s): a7c8181

Update README.md

Files changed (1)
  1. README.md +29 -136
README.md CHANGED
@@ -1,152 +1,45 @@
  ---
  library_name: transformers
- license: llama3.1
- base_model: Epiculous/NovaSpark-Instruct
  tags:
  - generated_from_trainer
  model-index:
- - name: outputs/NovaSpark_RP/5e-6_WD0.05_Waup8
  results: []
  ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
-
- [<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
- <details><summary>See axolotl config</summary>
-
- axolotl version: `0.4.1`
- ```yaml
- base_model: Epiculous/NovaSpark-Instruct
- model_type: AutoModelForCausalLM
- tokenizer_type: AutoTokenizer
-
- plugins:
-   - axolotl.integrations.liger.LigerPlugin
- liger_rope: true
- liger_rms_norm: true
- liger_swiglu: true
- liger_fused_linear_cross_entropy: true
-
- load_in_8bit: false
- load_in_4bit: false
- strict: false
-
- datasets:
-   - path: datasets/Crimson_Dawn-v0.2/RP/SynthRP-Gens_processed_09-25-2024_converted_filtered-deduplicated_deslopped-classified.jsonl
-     type: sharegpt
-     conversation: llama3
-   - path: datasets/Crimson_Dawn-v0.2/RP/stheno_data_filtered_v1.1_instruct_killed_processed_converted_filtered-deduplicated_deslopped-classified.jsonl
-     type: sharegpt
-     conversation: llama3
-   - path: datasets/Crimson_Dawn-v0.2/RP/sonnet35-charcard-roleplay-sharegpt_processed_converted_filtered-deduplicated_deslopped-classified.jsonl
-     type: sharegpt
-     conversation: llama3
-   - path: datasets/Crimson_Dawn-v0.2/RP/roleplay-deduped_processed_converted_filtered-deduplicated_deslopped-classified.jsonl
-     type: sharegpt
-     conversation: llama3
- dataset_prepared_path: last_run_prepared
- val_set_size: 0.01
- output_dir: ./outputs/NovaSpark_RP/5e-6_WD0.05_Waup8

- chat_template: llama3
- default_system_message: "You will will take whatever role the user gives you and act accordingly."

- sequence_len: 16384
- sample_packing: true
- eval_sample_packing: false
- shuffle_merged_datasets: true
- pad_to_sequence_len: false

- wandb_project: NovaSpark_RP
- wandb_name: 5e-6_WD0.05_Waup8
-
- gradient_accumulation_steps: 16
- micro_batch_size: 1
- num_epochs: 2
- optimizer: paged_adamw_8bit
- lr_scheduler: cosine
- learning_rate: 5e-6
-
- train_on_inputs: false
- group_by_length: false
- bf16: auto
- tf32: false
-
- gradient_checkpointing: unsloth
- gradient_checkpointing_kwargs:
-   use_reentrant: false
- logging_steps: 1
- flash_attention: true
- eager_attention: false
-
- warmup_steps: 8
- evals_per_epoch: 4
- saves_per_epoch: 1
- debug: true
- weight_decay: 0.05
-
- special_tokens:
-   pad_token: <|finetune_right_pad_id|>
-   eos_token: <|eot_id|>
  ```

- </details><br>
-
- # outputs/NovaSpark_RP/5e-6_WD0.05_Waup8
-
- This model is a fine-tuned version of [Epiculous/NovaSpark-Instruct](https://huggingface.co/Epiculous/NovaSpark-Instruct) on the None dataset.
- It achieves the following results on the evaluation set:
- - Loss: 1.1786
-
- ## Model description
-
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
-
- ## Training and evaluation data
-
- More information needed
-
- ## Training procedure
-
- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - learning_rate: 5e-06
- - train_batch_size: 1
- - eval_batch_size: 1
- - seed: 42
- - distributed_type: multi-GPU
- - num_devices: 2
- - gradient_accumulation_steps: 16
- - total_train_batch_size: 32
- - total_eval_batch_size: 2
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: cosine
- - lr_scheduler_warmup_steps: 8
- - num_epochs: 2
-
- ### Training results
-
- | Training Loss | Epoch | Step | Validation Loss |
- |:-------------:|:------:|:----:|:---------------:|
- | 1.3967 | 0.0194 | 1 | 1.3736 |
- | 1.3798 | 0.2518 | 13 | 1.3047 |
- | 1.2887 | 0.5036 | 26 | 1.2358 |
- | 1.2515 | 0.7554 | 39 | 1.2067 |
- | 1.2042 | 1.0048 | 52 | 1.1901 |
- | 1.0871 | 1.2560 | 65 | 1.1849 |
- | 1.1356 | 1.5072 | 78 | 1.1802 |
- | 1.139 | 1.7585 | 91 | 1.1786 |

- ### Framework versions

- - Transformers 4.45.1
- - Pytorch 2.3.0+cu121
- - Datasets 2.21.0
- - Tokenizers 0.20.0

  ---
  library_name: transformers
+ license: apache-2.0
+ base_model:
+ - grimjim/Llama-3.1-SuperNova-Lite-lorabilterated-8B
  tags:
  - generated_from_trainer
+ datasets:
+ - Epiculous/SynthRP-Gens-v1.1-Filtered-n-Cleaned
+ - anthracite-org/stheno-filtered-v1.1
+ - PJMixers/hieunguyenminh_roleplay-deduped-ShareGPT
+ - Gryphe/Sonnet3.5-Charcard-Roleplay
+ - Epiculous/Synthstruct-Gens-v1.1-Filtered-n-Cleaned
+ - anthracite-org/kalo-opus-instruct-22k-no-refusal
+ - anthracite-org/nopm_claude_writing_fixed
+ - anthracite-org/kalo_opus_misc_240827
  model-index:
+ - name: Epiculous/NovaSpark
  results: []
  ---

+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/64adfd277b5ff762771e4571/pnFt8anKzuycrmIuB-tew.png)

+ # Quants!
+ <strong>full</strong> / [exl2]() / [gguf]()
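
To run the full-precision weights rather than a quant, a minimal loading sketch with transformers is below; the repo id `Epiculous/NovaSpark` is assumed from the model-index name above, and bfloat16 is assumed since training ran in bf16.

```python
# Minimal sketch: load the full-precision weights with transformers.
# Assumptions: repo id "Epiculous/NovaSpark" (taken from the model-index name above),
# bfloat16 weights, and enough GPU memory for an 8B model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Epiculous/NovaSpark"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
```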
+ ## Prompting
+ This model is trained on the Llama 3 instruct template; the prompting structure goes a little something like this:

  ```
+ <|begin_of_text|><|start_header_id|>system<|end_header_id|>

+ {system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>

+ {prompt}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
+ ```
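
Rather than assembling this string by hand, the tokenizer's built-in chat template should produce a prompt in this layout. A minimal sketch, assuming the tokenizer in this repo ships the standard Llama 3 chat template:

```python
# Minimal sketch: build the prompt above via the tokenizer's chat template.
# Assumption: the tokenizer in this repo carries the Llama 3 chat template.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Epiculous/NovaSpark")

messages = [
    {"role": "system", "content": "You will take whatever role the user gives you and act accordingly."},
    {"role": "user", "content": "Introduce yourself in character."},
]

# add_generation_prompt=True appends the assistant header so the model answers as the assistant.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
```

Depending on the exact template bundled with the tokenizer, the system block may gain extra header fields, but the turn structure should match the example above.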
+ ### Context and Instruct
+ This model is trained on llama-instruct, so please use that Context and Instruct template.

+ ### Current Top Sampler Settings
+ [Smooth Creativity](https://files.catbox.moe/0ihfir.json): Credit to Juelsman for researching this one!<br/>
+ [Variant Chimera](https://files.catbox.moe/h7vd45.json): Credit to Numbra!<br/>
+ [Spicy_Temp](https://files.catbox.moe/9npj0z.json)<br/>
+ [Violet_Twilight-Nitral-Special](https://files.catbox.moe/ot54u3.json) <br/>
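
These are sampler-preset JSONs for frontends such as SillyTavern. If you call the model directly instead, such settings map onto generation arguments; the sketch below uses illustrative placeholder values, not the contents of any linked preset.

```python
# Rough sketch: pass sampler settings straight to transformers' generate().
# The numbers are placeholders for illustration, NOT taken from the presets above.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Epiculous/NovaSpark"  # assumed from the model-index name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

messages = [{"role": "user", "content": "Stay in character and greet me."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(
    input_ids,
    max_new_tokens=256,
    do_sample=True,
    temperature=1.0,         # placeholder sampler values; tune them or use the presets above
    top_p=0.95,
    repetition_penalty=1.05,
)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```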