metadata
tags:
  - generated_from_trainer
model-index:
  - name: distilbert-base-uncased-continued_training-medqa
    results: []

distilbert-base-uncased-continued_training-medqa

This model is a fine-tuned version of Shaier/distilbert-base-uncased-continued_training-medqa on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5389

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 512
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 220
  • mixed_precision_training: Native AMP
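
The batch-size and scheduler entries above can be checked by hand. The following is a minimal plain-Python sketch, not the training code itself. It assumes a single device (the card does not state the device count), takes the total of 73260 optimizer steps from the last row of the training results table, and models the linear scheduler as a linear warmup followed by linear decay to zero. The name `linear_lr` is illustrative only.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 2e-05
TRAIN_BATCH_SIZE = 64
GRADIENT_ACCUMULATION_STEPS = 8
WARMUP_STEPS = 100

# Assumption: total optimizer steps, read off the last row of the results table.
TOTAL_STEPS = 73260
# Assumption: one device; the card does not state the device count.
NUM_DEVICES = 1

# Number of examples contributing to each optimizer update.
total_train_batch_size = TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * NUM_DEVICES


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to 0."""
    if step < WARMUP_STEPS:
        scale = step / max(1, WARMUP_STEPS)
    else:
        scale = max(0.0, (TOTAL_STEPS - step) / max(1, TOTAL_STEPS - WARMUP_STEPS))
    return LEARNING_RATE * scale


if __name__ == "__main__":
    print("total_train_batch_size:", total_train_batch_size)
    for step in (0, 50, 100, 36680, 73260):
        print(step, linear_lr(step))
```

Under these assumptions the effective batch size is 64 × 8 = 512, matching `total_train_batch_size`, and the learning rate peaks at 2e-05 at step 100 before decaying to zero at the final step.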

Training results

Training Loss Epoch Step Validation Loss
No log 1.0 333 0.4516
No log 2.0 666 0.4277
No log 3.0 999 0.3734
No log 4.0 1332 0.4083
No log 5.0 1665 0.4134
No log 6.0 1998 0.5093
No log 7.0 2331 0.4639
0.4564 8.0 2664 0.5132
0.4564 9.0 2997 0.3483
0.4564 10.0 3330 0.4174
0.4564 11.0 3663 0.4975
0.4564 12.0 3996 0.4030
0.4564 13.0 4329 0.4476
0.4564 14.0 4662 0.3692
0.4564 15.0 4995 0.4474
0.4533 16.0 5328 0.3289
0.4533 17.0 5661 0.4647
0.4533 18.0 5994 0.4873
0.4533 19.0 6327 0.5323
0.4533 20.0 6660 0.4273
0.4533 21.0 6993 0.3426
0.4533 22.0 7326 0.3892
0.4533 23.0 7659 0.4297
0.4493 24.0 7992 0.4162
0.4493 25.0 8325 0.4424
0.4493 26.0 8658 0.4575
0.4493 27.0 8991 0.4192
0.4493 28.0 9324 0.4151
0.4493 29.0 9657 0.4321
0.4493 30.0 9990 0.4129
0.4493 31.0 10323 0.4869
0.4456 32.0 10656 0.4510
0.4456 33.0 10989 0.5263
0.4456 34.0 11322 0.3908
0.4456 35.0 11655 0.5016
0.4456 36.0 11988 0.4454
0.4456 37.0 12321 0.4011
0.4456 38.0 12654 0.4714
0.4456 39.0 12987 0.4972
0.443 40.0 13320 0.4200
0.443 41.0 13653 0.4659
0.443 42.0 13986 0.4758
0.443 43.0 14319 0.4509
0.443 44.0 14652 0.4211
0.443 45.0 14985 0.4007
0.443 46.0 15318 0.3205
0.443 47.0 15651 0.4479
0.4402 48.0 15984 0.4723
0.4402 49.0 16317 0.4956
0.4402 50.0 16650 0.4103
0.4402 51.0 16983 0.4234
0.4402 52.0 17316 0.4052
0.4402 53.0 17649 0.4033
0.4402 54.0 17982 0.4139
0.4402 55.0 18315 0.3618
0.4372 56.0 18648 0.5102
0.4372 57.0 18981 0.4166
0.4372 58.0 19314 0.4475
0.4372 59.0 19647 0.4259
0.4372 60.0 19980 0.4018
0.4372 61.0 20313 0.5005
0.4372 62.0 20646 0.4445
0.4372 63.0 20979 0.4280
0.434 64.0 21312 0.4533
0.434 65.0 21645 0.3672
0.434 66.0 21978 0.4726
0.434 67.0 22311 0.4084
0.434 68.0 22644 0.4508
0.434 69.0 22977 0.3746
0.434 70.0 23310 0.4703
0.434 71.0 23643 0.4789
0.4314 72.0 23976 0.3963
0.4314 73.0 24309 0.3800
0.4314 74.0 24642 0.5051
0.4314 75.0 24975 0.4245
0.4314 76.0 25308 0.4745
0.4314 77.0 25641 0.4351
0.4314 78.0 25974 0.4367
0.4314 79.0 26307 0.4200
0.4291 80.0 26640 0.4985
0.4291 81.0 26973 0.5058
0.4291 82.0 27306 0.4154
0.4291 83.0 27639 0.4837
0.4291 84.0 27972 0.3865
0.4291 85.0 28305 0.4357
0.4291 86.0 28638 0.3978
0.4291 87.0 28971 0.4413
0.4263 88.0 29304 0.4223
0.4263 89.0 29637 0.4241
0.4263 90.0 29970 0.4525
0.4263 91.0 30303 0.3895
0.4263 92.0 30636 0.4207
0.4263 93.0 30969 0.3217
0.4263 94.0 31302 0.3725
0.4263 95.0 31635 0.4354
0.4239 96.0 31968 0.4169
0.4239 97.0 32301 0.4873
0.4239 98.0 32634 0.4219
0.4239 99.0 32967 0.4984
0.4239 100.0 33300 0.4078
0.4239 101.0 33633 0.4463
0.4239 102.0 33966 0.3371
0.4239 103.0 34299 0.3896
0.422 104.0 34632 0.4743
0.422 105.0 34965 0.4931
0.422 106.0 35298 0.3574
0.422 107.0 35631 0.4127
0.422 108.0 35964 0.3892
0.422 109.0 36297 0.3881
0.422 110.0 36630 0.4221
0.422 111.0 36963 0.3924
0.4204 112.0 37296 0.4067
0.4204 113.0 37629 0.4357
0.4204 114.0 37962 0.4175
0.4204 115.0 38295 0.4424
0.4204 116.0 38628 0.3925
0.4204 117.0 38961 0.4693
0.4204 118.0 39294 0.3503
0.4204 119.0 39627 0.4761
0.4183 120.0 39960 0.3816
0.4183 121.0 40293 0.3903
0.4183 122.0 40626 0.3535
0.4183 123.0 40959 0.4388
0.4183 124.0 41292 0.4519
0.4183 125.0 41625 0.4241
0.4183 126.0 41958 0.4085
0.4183 127.0 42291 0.4836
0.4168 128.0 42624 0.4101
0.4168 129.0 42957 0.4749
0.4168 130.0 43290 0.4022
0.4168 131.0 43623 0.4861
0.4168 132.0 43956 0.4376
0.4168 133.0 44289 0.4597
0.4168 134.0 44622 0.4154
0.4168 135.0 44955 0.4431
0.415 136.0 45288 0.4887
0.415 137.0 45621 0.4229
0.415 138.0 45954 0.3997
0.415 139.0 46287 0.4185
0.415 140.0 46620 0.4633
0.415 141.0 46953 0.4061
0.415 142.0 47286 0.4604
0.415 143.0 47619 0.4047
0.4139 144.0 47952 0.4272
0.4139 145.0 48285 0.4783
0.4139 146.0 48618 0.3954
0.4139 147.0 48951 0.4501
0.4139 148.0 49284 0.4941
0.4139 149.0 49617 0.4112
0.4139 150.0 49950 0.4582
0.4139 151.0 50283 0.4361
0.4126 152.0 50616 0.3535
0.4126 153.0 50949 0.3797
0.4126 154.0 51282 0.4080
0.4126 155.0 51615 0.4049
0.4126 156.0 51948 0.4255
0.4126 157.0 52281 0.4303
0.4126 158.0 52614 0.4950
0.4126 159.0 52947 0.3721
0.4114 160.0 53280 0.2861
0.4114 161.0 53613 0.3775
0.4114 162.0 53946 0.4274
0.4114 163.0 54279 0.3904
0.4114 164.0 54612 0.4687
0.4114 165.0 54945 0.4013
0.4114 166.0 55278 0.4760
0.4114 167.0 55611 0.3554
0.4104 168.0 55944 0.5193
0.4104 169.0 56277 0.4476
0.4104 170.0 56610 0.5011
0.4104 171.0 56943 0.4441
0.4104 172.0 57276 0.4457
0.4104 173.0 57609 0.3792
0.4104 174.0 57942 0.5116
0.4104 175.0 58275 0.4249
0.4097 176.0 58608 0.3804
0.4097 177.0 58941 0.3886
0.4097 178.0 59274 0.4420
0.4097 179.0 59607 0.3573
0.4097 180.0 59940 0.3635
0.4097 181.0 60273 0.4596
0.4097 182.0 60606 0.3674
0.4097 183.0 60939 0.3869
0.409 184.0 61272 0.3909
0.409 185.0 61605 0.4339
0.409 186.0 61938 0.4475
0.409 187.0 62271 0.3218
0.409 188.0 62604 0.3771
0.409 189.0 62937 0.4007
0.409 190.0 63270 0.4520
0.409 191.0 63603 0.3980
0.4077 192.0 63936 0.4572
0.4077 193.0 64269 0.3952
0.4077 194.0 64602 0.4384
0.4077 195.0 64935 0.4795
0.4077 196.0 65268 0.3743
0.4077 197.0 65601 0.4445
0.4077 198.0 65934 0.3925
0.4077 199.0 66267 0.4564
0.4075 200.0 66600 0.4580
0.4075 201.0 66933 0.4446
0.4075 202.0 67266 0.4289
0.4075 203.0 67599 0.3722
0.4075 204.0 67932 0.4810
0.4075 205.0 68265 0.4004
0.4075 206.0 68598 0.4219
0.4075 207.0 68931 0.3926
0.407 208.0 69264 0.6043
0.407 209.0 69597 0.3835
0.407 210.0 69930 0.3791
0.407 211.0 70263 0.4152
0.407 212.0 70596 0.3654
0.407 213.0 70929 0.4434
0.407 214.0 71262 0.3613
0.407 215.0 71595 0.5103
0.4069 216.0 71928 0.3733
0.4069 217.0 72261 0.4881
0.4069 218.0 72594 0.3375
0.4069 219.0 72927 0.4766
0.4069 220.0 73260 0.4604
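
Validation loss fluctuates from epoch to epoch, so the last row is not necessarily the best one. The sketch below shows one way to parse rows laid out as in the table and pick the row with the lowest validation loss. Only a few rows are copied here for brevity, and `Row`, `parse_rows`, and `best_row` are illustrative names, not part of any library.

```python
from typing import List, NamedTuple, Optional


class Row(NamedTuple):
    training_loss: Optional[float]  # None where the table says "No log"
    epoch: float
    step: int
    validation_loss: float


# A few rows copied verbatim from the table above (not the full log).
EXCERPT = """\
No log 1.0 333 0.4516
0.4564 9.0 2997 0.3483
0.443 46.0 15318 0.3205
0.4114 160.0 53280 0.2861
0.407 208.0 69264 0.6043
0.4069 220.0 73260 0.4604
"""


def parse_rows(text: str) -> List[Row]:
    """Parse whitespace-separated table rows into Row records."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # The last three fields are always epoch, step, validation loss;
        # whatever precedes them is the training loss or "No log".
        *head, epoch, step, val = line.split()
        train = None if head == ["No", "log"] else float(head[0])
        rows.append(Row(train, float(epoch), int(step), float(val)))
    return rows


def best_row(rows: List[Row]) -> Row:
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda r: r.validation_loss)


if __name__ == "__main__":
    rows = parse_rows(EXCERPT)
    print("best:", best_row(rows))
    print("steps per epoch:", rows[-1].step / rows[-1].epoch)
```

Across the full table the lowest validation loss is 0.2861 at epoch 160 (step 53280), compared with 0.4604 at the final epoch. The step column advances by 333 per epoch, which gives 220 × 333 = 73260 steps in total.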

Framework versions

  • Transformers 4.18.0
  • Pytorch 1.11.0
  • Datasets 2.3.2
  • Tokenizers 0.11.0
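
To compare a local environment against the versions listed above, a small check along the following lines may help. This is a sketch under assumptions: the PyPI distribution names (in particular `torch` for "Pytorch") are assumed, and `installed_version` and `version_mismatches` are hypothetical helper names.

```python
from importlib import metadata

# Versions listed above; the PyPI distribution names are an assumption
# (the card says "Pytorch", which is published as "torch").
PINNED = {
    "transformers": "4.18.0",
    "torch": "1.11.0",
    "datasets": "2.3.2",
    "tokenizers": "0.11.0",
}


def installed_version(dist_name):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def version_mismatches(pinned, lookup=installed_version):
    """Map each package whose installed version differs to (wanted, found)."""
    mismatches = {}
    for name, wanted in pinned.items():
        found = lookup(name)
        # Ignore local build suffixes such as "+cu113" when comparing.
        if found is None or found.split("+")[0] != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in version_mismatches(PINNED).items():
        print(f"{name}: card lists {wanted}, found {found}")
```

The `lookup` parameter exists only so the comparison can be exercised without the packages installed, for example by passing the `get` method of a plain dictionary.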