
llama381binstruct_summarize_short

This model is a PEFT adapter fine-tuned from NousResearch/Meta-Llama-3.1-8B-Instruct on the generator dataset. It achieves the following results on the evaluation set:

  • Loss: 2.2237
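Because this checkpoint is a PEFT adapter rather than a full set of model weights, it is loaded on top of the base model. A minimal loading sketch, using the base and adapter repo ids shown on this page; the dtype and device settings are illustrative:

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "NousResearch/Meta-Llama-3.1-8B-Instruct"
adapter_id = "niting089/llama381binstruct_summarize_short"

tokenizer = AutoTokenizer.from_pretrained(base_id)
# device_map="auto" assumes accelerate is installed; adjust to your hardware.
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()
```

For deployment, the adapter can also be folded into the base weights with model.merge_and_unload(), at the cost of storing a full copy of the merged model.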

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.0002
  • train_batch_size: 1
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 30
  • training_steps: 500
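
For reference, a sketch of how these values map onto a Transformers TrainingArguments configuration. The output directory and the 25-step evaluation interval (read off the results table below) are assumptions, not part of the published configuration:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama381binstruct_summarize_short",  # assumed name
    learning_rate=2e-4,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,         # Adam with betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,      # and epsilon=1e-08
    lr_scheduler_type="linear",
    warmup_steps=30,
    max_steps=500,
    eval_strategy="steps",  # the results table reports eval every 25 steps
    eval_steps=25,
)
```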

Training results

| Training Loss | Epoch   | Step | Validation Loss |
|--------------:|--------:|-----:|----------------:|
| 1.822         | 1.1364  | 25   | 1.3842          |
| 0.883         | 2.2727  | 50   | 1.3566          |
| 0.4979        | 3.4091  | 75   | 1.5563          |
| 0.1678        | 4.5455  | 100  | 1.7178          |
| 0.088         | 5.6818  | 125  | 1.7917          |
| 0.0517        | 6.8182  | 150  | 1.8856          |
| 0.0407        | 7.9545  | 175  | 1.8559          |
| 0.0263        | 9.0909  | 200  | 1.9859          |
| 0.0192        | 10.2273 | 225  | 1.9196          |
| 0.0096        | 11.3636 | 250  | 2.0297          |
| 0.0112        | 12.5    | 275  | 2.0220          |
| 0.0033        | 13.6364 | 300  | 2.0524          |
| 0.0032        | 14.7727 | 325  | 2.1544          |
| 0.0024        | 15.9091 | 350  | 2.1657          |
| 0.0034        | 17.0455 | 375  | 2.1844          |
| 0.0021        | 18.1818 | 400  | 2.1901          |
| 0.0021        | 19.3182 | 425  | 2.2049          |
| 0.0021        | 20.4545 | 450  | 2.2152          |
| 0.0018        | 21.5909 | 475  | 2.2216          |
| 0.0021        | 22.7273 | 500  | 2.2237          |

Validation loss reaches its minimum (1.3566) at step 50 and climbs steadily afterwards while training loss falls toward zero, which suggests the adapter overfits the training data well before step 500.
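
Given that pattern, one way to retain the best checkpoint in a re-run is the Trainer's built-in best-model tracking. A sketch, with all values illustrative rather than taken from the original run:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama381binstruct_summarize_short",  # assumed name
    max_steps=500,
    eval_strategy="steps",
    eval_steps=25,
    save_strategy="steps",           # must align with the eval schedule
    save_steps=25,
    load_best_model_at_end=True,     # reload the lowest-eval-loss checkpoint
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)
```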

Framework versions

  • PEFT 0.12.0
  • Transformers 4.44.2
  • PyTorch 2.4.0+cu121
  • Datasets 3.0.0
  • Tokenizers 0.19.1