---
tags:
  - generated_from_trainer
model-index:
  - name: distilbert-base-uncased-continued_training-medqa
    results: []
---

# distilbert-base-uncased-continued_training-medqa

This model is a fine-tuned version of [Shaier/distilbert-base-uncased-continued_training-medqa](https://huggingface.co/Shaier/distilbert-base-uncased-continued_training-medqa) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 0.4063
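
Pending a fuller model description, here is a minimal usage sketch. It assumes the checkpoint carries a masked-language-modeling head, which is consistent with continued pretraining of `distilbert-base-uncased`; the prompt text is illustrative only.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_id = "Shaier/distilbert-base-uncased-continued_training-medqa"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

# Illustrative prompt; any text with a [MASK] token works.
text = "The patient was treated with [MASK]."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Locate the [MASK] position and report the top prediction.
mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
top_id = logits[0, mask_pos].argmax(dim=-1)
print(tokenizer.decode(top_id))
```

If the MLM assumption holds, the reported cross-entropy loss of 0.4063 corresponds to a perplexity of exp(0.4063) ≈ 1.50.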

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):

- learning_rate: 2e-05
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 512
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- num_epochs: 50
- mixed_precision_training: Native AMP
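
The card does not include the training script, but the settings above map naturally onto `transformers.TrainingArguments`. The sketch below is a reconstruction under stated assumptions, not the author's code: `output_dir` is a placeholder, the Adam betas/epsilon are the `Trainer` defaults, and `fp16=True` stands in for "Native AMP".

```python
from transformers import TrainingArguments

# Sketch only -- not the author's script.
# total_train_batch_size (512) falls out of 64 x 8 gradient accumulation.
training_args = TrainingArguments(
    output_dir="distilbert-base-uncased-continued_training-medqa",  # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    gradient_accumulation_steps=8,  # 64 * 8 = 512 effective batch size
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=50,
    fp16=True,  # "Native AMP"; Adam betas/epsilon left at Trainer defaults
    evaluation_strategy="epoch",  # assumed from the per-epoch validation losses below
)
```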

### Training results

| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| No log        | 1.0   | 333   | 0.4659          |
| No log        | 2.0   | 666   | 0.4547          |
| No log        | 3.0   | 999   | 0.3882          |
| No log        | 4.0   | 1332  | 0.4310          |
| No log        | 5.0   | 1665  | 0.4194          |
| No log        | 6.0   | 1998  | 0.5209          |
| No log        | 7.0   | 2331  | 0.4812          |
| 0.4829        | 8.0   | 2664  | 0.5321          |
| 0.4829        | 9.0   | 2997  | 0.3646          |
| 0.4829        | 10.0  | 3330  | 0.4339          |
| 0.4829        | 11.0  | 3663  | 0.5188          |
| 0.4829        | 12.0  | 3996  | 0.4148          |
| 0.4829        | 13.0  | 4329  | 0.4615          |
| 0.4829        | 14.0  | 4662  | 0.3825          |
| 0.4829        | 15.0  | 4995  | 0.4617          |
| 0.4773        | 16.0  | 5328  | 0.3400          |
| 0.4773        | 17.0  | 5661  | 0.4740          |
| 0.4773        | 18.0  | 5994  | 0.5057          |
| 0.4773        | 19.0  | 6327  | 0.5477          |
| 0.4773        | 20.0  | 6660  | 0.4426          |
| 0.4773        | 21.0  | 6993  | 0.3574          |
| 0.4773        | 22.0  | 7326  | 0.4031          |
| 0.4773        | 23.0  | 7659  | 0.4491          |
| 0.4715        | 24.0  | 7992  | 0.4340          |
| 0.4715        | 25.0  | 8325  | 0.4602          |
| 0.4715        | 26.0  | 8658  | 0.4659          |
| 0.4715        | 27.0  | 8991  | 0.4321          |
| 0.4715        | 28.0  | 9324  | 0.4335          |
| 0.4715        | 29.0  | 9657  | 0.4458          |
| 0.4715        | 30.0  | 9990  | 0.4285          |
| 0.4715        | 31.0  | 10323 | 0.5002          |
| 0.4671        | 32.0  | 10656 | 0.4706          |
| 0.4671        | 33.0  | 10989 | 0.5368          |
| 0.4671        | 34.0  | 11322 | 0.4028          |
| 0.4671        | 35.0  | 11655 | 0.5171          |
| 0.4671        | 36.0  | 11988 | 0.4506          |
| 0.4671        | 37.0  | 12321 | 0.4163          |
| 0.4671        | 38.0  | 12654 | 0.4905          |
| 0.4671        | 39.0  | 12987 | 0.5168          |
| 0.4646        | 40.0  | 13320 | 0.4412          |
| 0.4646        | 41.0  | 13653 | 0.4773          |
| 0.4646        | 42.0  | 13986 | 0.4835          |
| 0.4646        | 43.0  | 14319 | 0.4716          |
| 0.4646        | 44.0  | 14652 | 0.4431          |
| 0.4646        | 45.0  | 14985 | 0.4187          |
| 0.4646        | 46.0  | 15318 | 0.3389          |
| 0.4646        | 47.0  | 15651 | 0.4699          |
| 0.4628        | 48.0  | 15984 | 0.4880          |
| 0.4628        | 49.0  | 16317 | 0.5058          |
| 0.4628        | 50.0  | 16650 | 0.4275          |

### Framework versions

- Transformers 4.18.0
- Pytorch 1.11.0
- Datasets 2.3.2
- Tokenizers 0.11.0