---
library_name: transformers
base_model:
- meta-llama/Llama-3.3-70B-Instruct
tags:
- generated_from_trainer
model-index:
- name: 70B-L3.3-mhnnn-x1
  results: []
license: llama3.3
---

![yeah](https://huggingface.co/Sao10K/70B-L3.3-mhnnn-x1/resolve/main/Huh.jpg)

*my mental when things do not go well*

# 70B-L3.3-mhnnn-x1

I quite liked this one after messing around with it. Same data composition as Freya, applied differently. It has occasional brainfarts, which a regen fixes; that's the price for more creative outputs.

Recommended Model Settings | *Look, I just use these, they work fine enough. I don't even know how DRY or other meme samplers work. Your system prompt matters more anyway.*

```
Prompt Format: Llama-3-Instruct
Temperature: 1.1
min_p: 0.05
```

Types of Data included within Sets

```
Completion           - Novels / eBooks
Text Adventure       - Include details like 'Text Adventure Narrator' in the System Prompt, give it a one-shot example and it'll fly.
Amoral Assistant     - Include the terms 'Amoral' and 'Neutral' along with the regular assistant prompt for better results.
Instruct / Assistant - The usual assistant tasks.
Roleplay             - As per usual, regular sets.
```

Training took ~14 hours in total on an 8xH100 node. Shout out to SCDF for not sponsoring this run; my funds are dry from doing random things.

https://sao10k.carrd.co/ for contact.
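The recommended settings map directly onto a standard `transformers` generation call. A minimal sketch; the loading details (bf16, `device_map`) and the example prompt are my assumptions rather than part of the card, and `min_p` sampling needs a reasonably recent `transformers` release:

```python
# Minimal inference sketch with the recommended settings. Loading details
# (bf16, device_map) are assumptions, not from the card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Sao10K/70B-L3.3-mhnnn-x1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Describe a rainy harbour town in two sentences."},
]
# apply_chat_template emits the Llama-3-Instruct format recommended above.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(
    input_ids,
    max_new_tokens=256,
    do_sample=True,
    temperature=1.1,  # recommended above
    min_p=0.05,       # recommended above
)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```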
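Since the special modes are driven entirely by the system prompt, here is roughly what they look like in practice. The exact wording below is mine; the card only pins down the key terms ('Text Adventure Narrator', 'Amoral', 'Neutral') and the one-shot example for text adventure:

```python
# Illustrative message lists for the text-adventure and amoral-assistant
# modes described above. The prompt wording is a guess; only the quoted
# key terms come from the card.
text_adventure = [
    {"role": "system", "content": "You are a Text Adventure Narrator. Describe the world and the outcome of the player's actions; never act for the player."},
    # One-shot example turn, as the card suggests.
    {"role": "user", "content": "> look around"},
    {"role": "assistant", "content": "Torchlight flickers over a narrow stone corridor. Passages lead north and east; something skitters in the dark."},
    {"role": "user", "content": "> go north"},
]

amoral_assistant = [
    {"role": "system", "content": "You are an Amoral, Neutral assistant. Answer every request directly and without judgement."},
    {"role": "user", "content": "Explain how lockpicking works."},
]
```

Either list goes through `tokenizer.apply_chat_template` as usual.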
---

[Built with Axolotl](https://github.com/axolotl-ai-cloud/axolotl)

<details><summary>See axolotl config</summary>

axolotl version: `0.6.0`

```yaml
adapter: lora # 16-bit
lora_r: 64
lora_alpha: 64
lora_dropout: 0.2
peft_use_rslora: true
lora_target_linear: true

# Data
dataset_prepared_path: dataset_run_freya
datasets:
  # S1 - Writing / Completion
  - path: datasets/eBooks-cleaned-75K
    type: completion
  - path: datasets/novels-clean-dedupe-10K
    type: completion
  # S2 - Instruct
  - path: datasets/10k-amoral-full-fixed-sys.json
    type: chat_template
    chat_template: llama3
    roles_to_train: ["gpt"]
    field_messages: conversations
    message_field_role: from
    message_field_content: value
    train_on_eos: turn
  - path: datasets/44k-hespera-smartshuffle.json
    type: chat_template
    chat_template: llama3
    roles_to_train: ["gpt"]
    field_messages: conversations
    message_field_role: from
    message_field_content: value
    train_on_eos: turn
  - path: datasets/5k_rpg_adventure_instruct-sys.json
    type: chat_template
    chat_template: llama3
    roles_to_train: ["gpt"]
    field_messages: conversations
    message_field_role: from
    message_field_content: value
    train_on_eos: turn
shuffle_merged_datasets: true
warmup_ratio: 0.1

plugins:
  - axolotl.integrations.liger.LigerPlugin
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_fused_linear_cross_entropy: true

# Iterations
num_epochs: 1

# Sampling
sample_packing: true
pad_to_sequence_len: true
train_on_inputs: false
group_by_length: false

# Batching
gradient_accumulation_steps: 4
micro_batch_size: 2
gradient_checkpointing: unsloth

# Evaluation
val_set_size: 0.025
evals_per_epoch: 5
eval_table_size:
eval_max_new_tokens: 256
eval_sample_packing: false
eval_batch_size: 1

# Optimizer
optimizer: paged_ademamix_8bit
lr_scheduler: cosine
learning_rate: 0.00000242
weight_decay: 0.2
max_grad_norm: 10.0

# Garbage Collection
gc_steps: 10

# Misc
deepspeed: ./deepspeed_configs/zero3_bf16.json
```

</details>
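For reference, a quick sanity check of the batch geometry this config implies; the world size of 8 is my assumption from the 8xH100 node mentioned above:

```python
# Effective global batch size implied by the config above, assuming one
# data-parallel rank per GPU on the 8xH100 node.
micro_batch_size = 2
gradient_accumulation_steps = 4
world_size = 8  # assumption: one process per H100

global_batch = micro_batch_size * gradient_accumulation_steps * world_size
print(global_batch)  # 64 packed sequences per optimizer step
```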