---
license: mit
base_model: Aravindan/gpt2out
tags:
- generated_from_trainer
model-index:
- name: gpt2coder-8epochs
  results: []
---

# gpt2coder-8epochs

This model is a fine-tuned version of [Aravindan/gpt2out](https://huggingface.co/Aravindan/gpt2out) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 0.9270
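
Since the card does not yet document usage, here is a minimal inference sketch, assuming this is a standard GPT-2-style causal language model loadable from the `Aravindan/gpt2out` repository (the repo id and the code-completion prompt are illustrative assumptions, not confirmed by the card):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "Aravindan/gpt2out"  # assumed repo id; adjust to where this checkpoint is hosted
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

prompt = "def fibonacci(n):"  # illustrative prompt; the training data is not documented
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=True,
    top_p=0.95,
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 defines no pad token by default
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```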

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (see the `TrainingArguments` sketch after the list):

- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 10
- total_train_batch_size: 80
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 25
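
For reproducibility, the list above maps onto a `transformers.TrainingArguments` configuration roughly as follows. This is a sketch, not the author's actual training script: `output_dir` is illustrative, and the Adam betas and epsilon reported above are the library defaults, so they need no explicit arguments.

```python
from transformers import TrainingArguments

# Sketch of the reported hyperparameters; output_dir is an assumption taken from the model name
training_args = TrainingArguments(
    output_dir="gpt2coder-8epochs",
    learning_rate=5e-05,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=10,  # effective train batch size: 8 * 10 = 80
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=25,
)
```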

### Training results

| Training Loss | Epoch   | Step | Validation Loss |
|:-------------:|:-------:|:----:|:---------------:|
| No log        | 0.9810  | 31   | 3.2508          |
| No log        | 1.9937  | 63   | 2.6920          |
| No log        | 2.9747  | 94   | 2.3769          |
| No log        | 3.9873  | 126  | 2.1444          |
| No log        | 5.0     | 158  | 1.9673          |
| No log        | 5.9810  | 189  | 1.8320          |
| No log        | 6.9937  | 221  | 1.7097          |
| No log        | 7.9747  | 252  | 1.6159          |
| No log        | 8.9873  | 284  | 1.5231          |
| No log        | 10.0    | 316  | 1.4535          |
| No log        | 10.9810 | 347  | 1.3788          |
| No log        | 11.9937 | 379  | 1.3109          |
| No log        | 12.9747 | 410  | 1.2496          |
| No log        | 13.9873 | 442  | 1.1989          |
| No log        | 14.9810 | 465  | 1.1647          |
| No log        | 15.9937 | 497  | 1.1208          |
| 1.3856        | 16.9747 | 528  | 1.0841          |
| 1.3856        | 17.9873 | 560  | 1.0464          |
| 1.3856        | 19.0    | 592  | 1.0180          |
| 1.3856        | 19.9810 | 623  | 0.9928          |
| 1.3856        | 20.9937 | 655  | 0.9689          |
| 1.3856        | 21.9747 | 686  | 0.9517          |
| 1.3856        | 22.9873 | 718  | 0.9390          |
| 1.3856        | 24.0    | 750  | 0.9298          |
| 1.3856        | 24.7911 | 775  | 0.9270          |
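
For context, the final validation loss of 0.9270 corresponds to a perplexity of exp(0.9270) ≈ 2.53, assuming the reported value is the standard per-token cross-entropy loss.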

### Framework versions

- Transformers 4.41.1
- Pytorch 2.1.2
- Datasets 2.19.1
- Tokenizers 0.19.1