---
tags:
- generated_from_trainer
model-index:
- name: distilbert-base-uncased-continued_training-medqa
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# distilbert-base-uncased-continued_training-medqa

This model is a fine-tuned version of [Shaier/distilbert-base-uncased-continued_training-medqa](https://huggingface.co/Shaier/distilbert-base-uncased-continued_training-medqa) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.4063

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 512
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- num_epochs: 50
- mixed_precision_training: Native AMP
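
The relationship between `train_batch_size`, `gradient_accumulation_steps`, and `total_train_batch_size`, and the shape of a linear schedule with warmup, can be sketched in plain Python. This is an illustrative sketch, not the training script: it assumes a single device (the card does not state the device count), takes the total step count of 16,650 from the last row of the results table, and uses hypothetical helper names `effective_batch_size` and `linear_lr`.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 2e-05
TRAIN_BATCH_SIZE = 64
GRADIENT_ACCUMULATION_STEPS = 8
WARMUP_STEPS = 100

# Assumptions, not stated in the card: a single device, and 16,650 total
# optimizer steps (taken from the last row of the training results table).
NUM_DEVICES = 1
TOTAL_STEPS = 16650


def effective_batch_size(per_device, accumulation_steps, num_devices=1):
    """Number of examples contributing to one optimizer update."""
    return per_device * accumulation_steps * num_devices


def linear_lr(step, base_lr=LEARNING_RATE, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Linear warmup from 0 to base_lr, then linear decay back to 0."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


print(effective_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS, NUM_DEVICES))  # 512

# Learning rate at the start, mid-warmup, the peak, mid-decay, and the end.
for step in (0, 50, 100, 8375, 16650):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate peaks at 2e-05 after 100 steps and reaches zero at the final step, so warmup covers well under 1% of training.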

### Training results

| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| No log        | 1.0   | 333   | 0.4659          |
| No log        | 2.0   | 666   | 0.4547          |
| No log        | 3.0   | 999   | 0.3882          |
| No log        | 4.0   | 1332  | 0.4310          |
| No log        | 5.0   | 1665  | 0.4194          |
| No log        | 6.0   | 1998  | 0.5209          |
| No log        | 7.0   | 2331  | 0.4812          |
| 0.4829        | 8.0   | 2664  | 0.5321          |
| 0.4829        | 9.0   | 2997  | 0.3646          |
| 0.4829        | 10.0  | 3330  | 0.4339          |
| 0.4829        | 11.0  | 3663  | 0.5188          |
| 0.4829        | 12.0  | 3996  | 0.4148          |
| 0.4829        | 13.0  | 4329  | 0.4615          |
| 0.4829        | 14.0  | 4662  | 0.3825          |
| 0.4829        | 15.0  | 4995  | 0.4617          |
| 0.4773        | 16.0  | 5328  | 0.3400          |
| 0.4773        | 17.0  | 5661  | 0.4740          |
| 0.4773        | 18.0  | 5994  | 0.5057          |
| 0.4773        | 19.0  | 6327  | 0.5477          |
| 0.4773        | 20.0  | 6660  | 0.4426          |
| 0.4773        | 21.0  | 6993  | 0.3574          |
| 0.4773        | 22.0  | 7326  | 0.4031          |
| 0.4773        | 23.0  | 7659  | 0.4491          |
| 0.4715        | 24.0  | 7992  | 0.4340          |
| 0.4715        | 25.0  | 8325  | 0.4602          |
| 0.4715        | 26.0  | 8658  | 0.4659          |
| 0.4715        | 27.0  | 8991  | 0.4321          |
| 0.4715        | 28.0  | 9324  | 0.4335          |
| 0.4715        | 29.0  | 9657  | 0.4458          |
| 0.4715        | 30.0  | 9990  | 0.4285          |
| 0.4715        | 31.0  | 10323 | 0.5002          |
| 0.4671        | 32.0  | 10656 | 0.4706          |
| 0.4671        | 33.0  | 10989 | 0.5368          |
| 0.4671        | 34.0  | 11322 | 0.4028          |
| 0.4671        | 35.0  | 11655 | 0.5171          |
| 0.4671        | 36.0  | 11988 | 0.4506          |
| 0.4671        | 37.0  | 12321 | 0.4163          |
| 0.4671        | 38.0  | 12654 | 0.4905          |
| 0.4671        | 39.0  | 12987 | 0.5168          |
| 0.4646        | 40.0  | 13320 | 0.4412          |
| 0.4646        | 41.0  | 13653 | 0.4773          |
| 0.4646        | 42.0  | 13986 | 0.4835          |
| 0.4646        | 43.0  | 14319 | 0.4716          |
| 0.4646        | 44.0  | 14652 | 0.4431          |
| 0.4646        | 45.0  | 14985 | 0.4187          |
| 0.4646        | 46.0  | 15318 | 0.3389          |
| 0.4646        | 47.0  | 15651 | 0.4699          |
| 0.4628        | 48.0  | 15984 | 0.4880          |
| 0.4628        | 49.0  | 16317 | 0.5058          |
| 0.4628        | 50.0  | 16650 | 0.4275          |
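
The validation loss in the table fluctuates from epoch to epoch rather than decreasing steadily, so the last checkpoint is not necessarily the best one. The following sketch locates the lowest validation loss among the rows above. The values are transcribed from the table; the helper name `best_epoch` is hypothetical, and 333 steps per epoch is read off the Step column.

```python
# Validation loss per epoch, transcribed from the table above (epochs 1..50).
VAL_LOSS = [
    0.4659, 0.4547, 0.3882, 0.4310, 0.4194, 0.5209, 0.4812, 0.5321, 0.3646, 0.4339,
    0.5188, 0.4148, 0.4615, 0.3825, 0.4617, 0.3400, 0.4740, 0.5057, 0.5477, 0.4426,
    0.3574, 0.4031, 0.4491, 0.4340, 0.4602, 0.4659, 0.4321, 0.4335, 0.4458, 0.4285,
    0.5002, 0.4706, 0.5368, 0.4028, 0.5171, 0.4506, 0.4163, 0.4905, 0.5168, 0.4412,
    0.4773, 0.4835, 0.4716, 0.4431, 0.4187, 0.3389, 0.4699, 0.4880, 0.5058, 0.4275,
]

# Each epoch spans 333 optimizer steps according to the Step column.
STEPS_PER_EPOCH = 333


def best_epoch(losses):
    """Return (1-based epoch, loss) for the lowest validation loss."""
    idx = min(range(len(losses)), key=losses.__getitem__)
    return idx + 1, losses[idx]


epoch, loss = best_epoch(VAL_LOSS)
print(epoch, epoch * STEPS_PER_EPOCH, loss)  # 46 15318 0.3389
```

The lowest logged value (0.3389, epoch 46) is noticeably below the final-epoch value (0.4275). Whether a checkpoint from that epoch was saved is not stated in this card.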


### Framework versions

- Transformers 4.18.0
- PyTorch 1.11.0
- Datasets 2.3.2
- Tokenizers 0.11.0