---
license: mit
library_name: peft
tags:
  - trl
  - sft
  - generated_from_trainer
base_model: Aravindan/gpt2out
datasets:
  - generator
model-index:
  - name: output_dir
    results: []
---

# output_dir

This model is a fine-tuned version of [Aravindan/gpt2out](https://huggingface.co/Aravindan/gpt2out) on the generator dataset. It achieves the following results on the evaluation set:

- Loss: 1.9619
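Since this is a PEFT adapter on top of Aravindan/gpt2out, it can be loaded for inference roughly as follows. This is a minimal sketch, not taken from this card: the adapter repository id below is a placeholder, and `AutoPeftModelForCausalLM` resolves the base model from the adapter's config.

```python
# Minimal inference sketch (assumption: this repo hosts a PEFT adapter).
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

ADAPTER_ID = "path/to/this-adapter"  # placeholder: substitute this repository's Hub id

# Loads Aravindan/gpt2out and applies the adapter weights on top of it.
model = AutoPeftModelForCausalLM.from_pretrained(ADAPTER_ID)
tokenizer = AutoTokenizer.from_pretrained("Aravindan/gpt2out")

inputs = tokenizer("Once upon a time", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```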

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.0001
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 10
- total_train_batch_size: 80
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant
- training_steps: 1000
- mixed_precision_training: Native AMP
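For reference, the settings above map roughly onto the following TRL training sketch. This is an illustration under assumptions, not the exact script used: the dataset here is a dummy built with `Dataset.from_generator` (consistent with the "generator" dataset name HF assigns to such datasets), and the LoRA config is assumed, since the card does not document the adapter settings.

```python
# Reproduction sketch (assumed: TRL's SFTTrainer with a LoRA peft_config,
# on a TRL version contemporary with the stack listed under Framework versions).
from datasets import Dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import SFTTrainer

model_id = "Aravindan/gpt2out"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

def gen():
    # Placeholder: the actual training data is not documented in this card.
    yield {"text": "example training document"}

train_dataset = Dataset.from_generator(gen)

args = TrainingArguments(
    output_dir="output_dir",
    learning_rate=1e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=10,  # 8 x 10 = effective batch size of 80
    lr_scheduler_type="constant",
    max_steps=1000,
    seed=42,
    fp16=True,                       # "Native AMP" mixed precision
)

trainer = SFTTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    tokenizer=tokenizer,
    dataset_text_field="text",
    peft_config=LoraConfig(task_type="CAUSAL_LM"),  # assumed adapter config
)
trainer.train()
```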

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 2.6318        | 0.0147 | 30   | 2.4202          |
| 2.5147        | 0.0294 | 60   | 2.3425          |
| 2.4599        | 0.0440 | 90   | 2.2838          |
| 2.4009        | 0.0587 | 120  | 2.2386          |
| 2.394         | 0.0734 | 150  | 2.1971          |
| 2.3459        | 0.0881 | 180  | 2.1614          |
| 2.3057        | 0.1027 | 210  | 2.1324          |
| 2.3085        | 0.1174 | 240  | 2.1076          |
| 2.2675        | 0.1321 | 270  | 2.0891          |
| 2.2348        | 0.1468 | 300  | 2.0716          |
| 2.2167        | 0.1614 | 330  | 2.0594          |
| 2.1827        | 0.1761 | 360  | 2.0481          |
| 2.2049        | 0.1908 | 390  | 2.0390          |
| 2.1803        | 0.2055 | 420  | 2.0303          |
| 2.1709        | 0.2201 | 450  | 2.0250          |
| 2.1915        | 0.2348 | 480  | 2.0183          |
| 2.1583        | 0.2495 | 510  | 2.0120          |
| 2.168         | 0.2642 | 540  | 2.0072          |
| 2.1678        | 0.2788 | 570  | 2.0026          |
| 2.1545        | 0.2935 | 600  | 1.9988          |
| 2.1561        | 0.3082 | 630  | 1.9941          |
| 2.1442        | 0.3229 | 660  | 1.9913          |
| 2.1393        | 0.3375 | 690  | 1.9867          |
| 2.1489        | 0.3522 | 720  | 1.9834          |
| 2.1304        | 0.3669 | 750  | 1.9814          |
| 2.1175        | 0.3816 | 780  | 1.9783          |
| 2.113         | 0.3962 | 810  | 1.9753          |
| 2.1025        | 0.4109 | 840  | 1.9729          |
| 2.1181        | 0.4256 | 870  | 1.9711          |
| 2.0947        | 0.4403 | 900  | 1.9688          |
| 2.0868        | 0.4549 | 930  | 1.9665          |
| 2.1061        | 0.4696 | 960  | 1.9638          |
| 2.1096        | 0.4843 | 990  | 1.9619          |
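For context, and assuming the reported loss is the standard mean per-token cross-entropy, the final validation loss of 1.9619 corresponds to a perplexity of exp(1.9619) ≈ 7.11.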

### Framework versions

- PEFT 0.11.1
- Transformers 4.41.2
- Pytorch 2.1.2
- Datasets 2.19.2
- Tokenizers 0.19.1