metadata
tags:
  - generated_from_trainer
model-index:
  - name: checkpoint-124500-finetuned-squad
    results: []

checkpoint-124500-finetuned-squad

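This model was trained from scratch on an unknown dataset. The loss reported here is the validation loss after the final epoch (epoch 100, step 328900); the lowest validation loss in the training results, 3.6368, occurs at epoch 4. It achieves the following results on the evaluation set: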

  • Loss: 14.9594

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
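As a rough illustration of what the linear scheduler implies for these settings, the sketch below computes the learning rate at a given optimizer step in plain Python. The step count per epoch (3289) is taken from the training results table. The warmup length is not listed in this card, so `WARMUP_STEPS = 0` is an assumption, and `linear_lr` is a hypothetical helper name, not a library function.

```python
# Sketch of a linear learning-rate decay for the hyperparameters listed above.
BASE_LR = 2e-05          # learning_rate from the list above
STEPS_PER_EPOCH = 3289   # step count after epoch 1 in the training results
NUM_EPOCHS = 100         # num_epochs from the list above
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 328900, the final step in the table
WARMUP_STEPS = 0         # ASSUMPTION: warmup is not stated in this card


def linear_lr(step, base_lr=BASE_LR, total=TOTAL_STEPS, warmup=WARMUP_STEPS):
    """Learning rate at `step`: linear warmup (if any), then linear decay to 0."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    remaining = max(0.0, (total - step) / max(1, total - warmup))
    return base_lr * remaining


if __name__ == "__main__":
    for epoch in (0, 4, 50, 100):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:3d}  step {step:6d}  lr {linear_lr(step):.3e}")
```

Under the zero-warmup assumption, the rate starts at 2e-05, reaches half that value at step 164450 (epoch 50), and hits zero at step 328900.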

Training results

Training Loss Epoch Step Validation Loss
3.9975 1.0 3289 3.8405
3.7311 2.0 6578 3.7114
3.5681 3.0 9867 3.6829
3.4101 4.0 13156 3.6368
3.2487 5.0 16445 3.6526
3.1143 6.0 19734 3.7567
2.9783 7.0 23023 3.8469
2.8295 8.0 26312 4.0040
2.6912 9.0 29601 4.1996
2.5424 10.0 32890 4.3387
2.4161 11.0 36179 4.4988
2.2713 12.0 39468 4.7861
2.1413 13.0 42757 4.9276
2.0125 14.0 46046 5.0598
1.8798 15.0 49335 5.3347
1.726 16.0 52624 5.5869
1.5994 17.0 55913 5.7161
1.4643 18.0 59202 6.0174
1.3237 19.0 62491 6.4926
1.2155 20.0 65780 6.4882
1.1029 21.0 69069 6.9922
0.9948 22.0 72358 7.1357
0.9038 23.0 75647 7.3676
0.8099 24.0 78936 7.4180
0.7254 25.0 82225 7.7753
0.6598 26.0 85514 7.8643
0.5723 27.0 88803 8.1798
0.5337 28.0 92092 8.3053
0.4643 29.0 95381 8.8597
0.4241 30.0 98670 8.9849
0.3763 31.0 101959 8.8406
0.3479 32.0 105248 9.1517
0.3271 33.0 108537 9.3659
0.2911 34.0 111826 9.4813
0.2836 35.0 115115 9.5746
0.2528 36.0 118404 9.7027
0.2345 37.0 121693 9.7515
0.2184 38.0 124982 9.9729
0.2067 39.0 128271 10.0828
0.2077 40.0 131560 10.0878
0.1876 41.0 134849 10.2974
0.1719 42.0 138138 10.2712
0.1637 43.0 141427 10.5788
0.1482 44.0 144716 10.7465
0.1509 45.0 148005 10.4603
0.1358 46.0 151294 10.7665
0.1316 47.0 154583 10.7724
0.1223 48.0 157872 11.1766
0.1205 49.0 161161 11.1870
0.1203 50.0 164450 11.1053
0.1081 51.0 167739 10.9696
0.103 52.0 171028 11.2010
0.0938 53.0 174317 11.6728
0.0924 54.0 177606 11.1423
0.0922 55.0 180895 11.7409
0.0827 56.0 184184 11.7850
0.0829 57.0 187473 11.8956
0.073 58.0 190762 11.8915
0.0788 59.0 194051 12.1617
0.0734 60.0 197340 12.2007
0.0729 61.0 200629 12.2388
0.0663 62.0 203918 12.2471
0.0662 63.0 207207 12.5830
0.064 64.0 210496 12.6105
0.0599 65.0 213785 12.3712
0.0604 66.0 217074 12.9249
0.0574 67.0 220363 12.7309
0.0538 68.0 223652 12.8068
0.0526 69.0 226941 13.4368
0.0471 70.0 230230 13.5148
0.0436 71.0 233519 13.3391
0.0448 72.0 236808 13.4100
0.0428 73.0 240097 13.5617
0.0401 74.0 243386 13.8674
0.035 75.0 246675 13.5746
0.0342 76.0 249964 13.5042
0.0344 77.0 253253 14.2085
0.0365 78.0 256542 13.6393
0.0306 79.0 259831 13.9807
0.0311 80.0 263120 13.9768
0.0353 81.0 266409 14.5245
0.0299 82.0 269698 13.9471
0.0263 83.0 272987 13.7899
0.0254 84.0 276276 14.3786
0.0267 85.0 279565 14.5611
0.022 86.0 282854 14.2658
0.0198 87.0 286143 14.9215
0.0193 88.0 289432 14.5650
0.0228 89.0 292721 14.7014
0.0184 90.0 296010 14.6946
0.0182 91.0 299299 14.6614
0.0188 92.0 302588 14.6915
0.0196 93.0 305877 14.7262
0.0138 94.0 309166 14.7625
0.0201 95.0 312455 15.0442
0.0189 96.0 315744 14.8832
0.0148 97.0 319033 14.8995
0.0129 98.0 322322 14.8974
0.0132 99.0 325611 14.9813
0.0139 100.0 328900 14.9594
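
The table shows training loss falling steadily while validation loss bottoms out early and then climbs. A minimal sketch of how one might pick the best epoch from such a log, and where a patience-based early-stopping rule would have halted, is given below. Only the first eight rows of the table are copied in; `best_epoch` and `early_stop_epoch` are hypothetical helper names, and the patience value of 3 is an illustrative assumption, not something this training run used.

```python
# (epoch, training_loss, validation_loss) copied from the first eight table rows.
history = [
    (1, 3.9975, 3.8405),
    (2, 3.7311, 3.7114),
    (3, 3.5681, 3.6829),
    (4, 3.4101, 3.6368),
    (5, 3.2487, 3.6526),
    (6, 3.1143, 3.7567),
    (7, 2.9783, 3.8469),
    (8, 2.8295, 4.0040),
]


def best_epoch(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


def early_stop_epoch(rows, patience):
    """Return the epoch at which training would stop after `patience`
    consecutive epochs without a new best validation loss, or None."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, _train_loss, val_loss in rows:
        if val_loss < best:
            best = val_loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


if __name__ == "__main__":
    epoch, train_loss, val_loss = best_epoch(history)
    print(f"best epoch: {epoch} (validation loss {val_loss})")
    # patience=3 is an illustrative assumption
    print(f"early stop at epoch: {early_stop_epoch(history, patience=3)}")
```

On these rows the lowest validation loss is at epoch 4, and a patience of 3 would stop training at epoch 7.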

Framework versions

  • Transformers 4.19.2
  • PyTorch 1.11.0+cu102
  • Datasets 2.2.2
  • Tokenizers 0.12.1