base_model: ai-forever/rugpt3medium_based_on_gpt2
tags:
  - generated_from_trainer
model-index:
  - name: my_rugpt3medium_finetune
    results: []

my_rugpt3medium_finetune

This model is a fine-tuned version of ai-forever/rugpt3medium_based_on_gpt2. The fine-tuning dataset was not recorded in this card. It achieves the following results on the evaluation set:

  • Loss: 4.3387
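For readers who prefer perplexity, the reported loss can be converted with a short sketch like the one below. It assumes the evaluation loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language models but is not stated in this card.

```python
import math

eval_loss = 4.3387  # final validation loss reported above

# Assumption: the loss is mean per-token cross-entropy measured in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)

print(f"eval loss {eval_loss} -> perplexity of roughly {perplexity:.1f}")
```

Under that assumption the perplexity comes out at roughly 76.6. The figure is only meaningful relative to the same tokenizer and the same evaluation set.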

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 3
  • total_train_batch_size: 24
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 1000
  • num_epochs: 25
  • mixed_precision_training: Native AMP
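
The interaction between the warmup and the cosine decay is easier to see numerically. The sketch below is a plain-Python approximation of a linear-warmup, half-cosine schedule, not the trainer's own code. `TOTAL_STEPS = 1350` is an assumption taken from the last step logged in the training results, and `lr_at` is a hypothetical helper name.

```python
import math

BASE_LR = 1e-05       # learning_rate from the list above
WARMUP_STEPS = 1000   # lr_scheduler_warmup_steps from the list above
TOTAL_STEPS = 1350    # assumption: last step logged in the training results

# total_train_batch_size = train_batch_size * gradient_accumulation_steps,
# which matches the listed value of 24 when a single device is used.
effective_batch = 8 * 3


def lr_at(step, base_lr=BASE_LR, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Linear warmup to base_lr, then half-cosine decay to zero."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    progress = (step - warmup) / max(1, total - warmup)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, 250, 500, 1000, 1175, 1350):
    print(f"step {step:>4}: lr = {lr_at(step):.2e}")
```

If the total really is 1350 steps, the warmup occupies about three quarters of the run, so the peak learning rate of 1e-05 is reached only at step 1000 and the cosine decay covers the final 350 steps.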

Training results

| Training Loss | Epoch | Step | Validation Loss |
|---------------|-------|------|-----------------|
| 10.916 | 0.46 | 25 | 10.6340 |
| 10.3795 | 0.92 | 50 | 9.9985 |
| 9.9003 | 1.38 | 75 | 9.7015 |
| 9.6822 | 1.84 | 100 | 9.5795 |
| 9.5804 | 2.3 | 125 | 9.5130 |
| 9.5294 | 2.76 | 150 | 9.4485 |
| 9.439 | 3.22 | 175 | 9.3772 |
| 9.3698 | 3.68 | 200 | 9.2804 |
| 9.2964 | 4.14 | 225 | 9.1746 |
| 9.1945 | 4.6 | 250 | 9.0623 |
| 9.0492 | 5.06 | 275 | 8.9352 |
| 8.9521 | 5.52 | 300 | 8.8157 |
| 8.8634 | 5.98 | 325 | 8.6838 |
| 8.7197 | 6.44 | 350 | 8.5445 |
| 8.6485 | 6.9 | 375 | 8.4181 |
| 8.522 | 7.36 | 400 | 8.2732 |
| 8.4227 | 7.82 | 425 | 8.1704 |
| 8.3083 | 8.28 | 450 | 8.0290 |
| 8.1897 | 8.74 | 475 | 7.8989 |
| 8.0876 | 9.2 | 500 | 7.7778 |
| 7.9824 | 9.66 | 525 | 7.6368 |
| 7.8762 | 10.12 | 550 | 7.4974 |
| 7.7408 | 10.58 | 575 | 7.3658 |
| 7.6855 | 11.04 | 600 | 7.2416 |
| 7.5163 | 11.5 | 625 | 7.1291 |
| 7.5079 | 11.96 | 650 | 7.0295 |
| 7.2873 | 12.42 | 675 | 6.8522 |
| 7.2856 | 12.88 | 700 | 6.7573 |
| 7.0868 | 13.34 | 725 | 6.6651 |
| 7.0886 | 13.8 | 750 | 6.5239 |
| 6.9283 | 14.26 | 775 | 6.3561 |
| 6.8257 | 14.72 | 800 | 6.2392 |
| 6.7328 | 15.18 | 825 | 6.1004 |
| 6.6153 | 15.64 | 850 | 5.9846 |
| 6.5824 | 16.1 | 875 | 5.8627 |
| 6.3905 | 16.56 | 900 | 5.7724 |
| 6.359 | 17.02 | 925 | 5.6321 |
| 6.1679 | 17.48 | 950 | 5.5329 |
| 6.1526 | 17.94 | 975 | 5.4058 |
| 5.9604 | 18.4 | 1000 | 5.3046 |
| 5.9669 | 18.87 | 1025 | 5.1939 |
| 5.6807 | 19.33 | 1050 | 5.0499 |
| 5.7445 | 19.79 | 1075 | 4.9479 |
| 5.6578 | 20.25 | 1100 | 4.8343 |
| 5.4919 | 20.71 | 1125 | 4.7547 |
| 5.4427 | 21.17 | 1150 | 4.6506 |
| 5.3212 | 21.63 | 1175 | 4.5628 |
| 5.2953 | 22.09 | 1200 | 4.4814 |
| 5.1872 | 22.55 | 1225 | 4.4373 |
| 5.1285 | 23.01 | 1250 | 4.3966 |
| 5.047 | 23.47 | 1275 | 4.3611 |
| 5.0698 | 23.93 | 1300 | 4.3520 |
| 5.1259 | 24.39 | 1325 | 4.3408 |
| 4.9851 | 24.85 | 1350 | 4.3387 |
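
Because the training data is not documented, the Epoch and Step columns are the only hint at its size. The sketch below is a rough back-of-the-envelope estimate, not a reported figure. It assumes the logged epoch values are rounded to two decimals and that each optimizer step consumes one full batch of 24 examples.

```python
last_step = 1350             # final logged step in the results above
last_epoch = 24.85           # epoch value logged at that step (rounded)
total_train_batch_size = 24  # from the hyperparameter list

# Optimizer steps needed to pass over the training set once (approximate).
steps_per_epoch = last_step / last_epoch

# Assumption: every optimizer step sees a full batch of 24 examples.
approx_examples = steps_per_epoch * total_train_batch_size

print(f"about {steps_per_epoch:.1f} optimizer steps per epoch")
print(f"about {approx_examples:.0f} training examples (order of magnitude only)")
```

This suggests a training set on the order of 1,300 examples. Rounding in the Epoch column and the handling of partial batches mean the true count could differ by a few dozen.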

Framework versions

  • Transformers 4.35.2
  • PyTorch 2.1.0+cu121
  • Datasets 2.16.0
  • Tokenizers 0.15.0
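
To check a local environment against these versions before reproducing the run, something like the following sketch may help. The distribution names in `EXPECTED` (for example `torch` for PyTorch) are assumptions about how the packages are published, and `compare_versions` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this card. The distribution names are assumptions.
EXPECTED = {
    "transformers": "4.35.2",
    "torch": "2.1.0+cu121",
    "datasets": "2.16.0",
    "tokenizers": "0.15.0",
}


def compare_versions(expected=EXPECTED):
    """Return {name: (wanted, installed_or_None, matches)} for each package."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (wanted, installed, installed == wanted)
    return report


for name, (wanted, installed, ok) in compare_versions().items():
    status = "ok" if ok else "differs"
    print(f"{name}: wanted {wanted}, found {installed} ({status})")
```

A mismatch does not necessarily break loading or inference. It only signals that results may not be exactly reproducible.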