
Llama-3.1-8B-medquad-V2

This model is a fine-tuned version of meta-llama/Llama-3.1-8B on the MedQuAD dataset (Ben Abacha and Demner-Fushman, 2019). It achieves the following result on the evaluation set (an inference sketch follows the result below):

  • Loss: 0.8959
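
The framework versions listed below indicate this repository is a PEFT adapter rather than a full model, so it has to be loaded on top of the meta-llama/Llama-3.1-8B base weights. The following is a minimal loading sketch, not an official usage recipe: the dtype, generation settings, and the "Question:/Answer:" prompt format are assumptions, since the card does not document the prompt template used during fine-tuning.

```python
# Minimal inference sketch (assumptions noted above): load the base model, then
# attach this repository as a PEFT adapter. Requires access to the gated
# meta-llama/Llama-3.1-8B weights.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-3.1-8B"
adapter_id = "mariamoracrossitcr/Llama-3.1-8B-medquad-V2"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

# Illustrative prompt format; the actual training template is not documented here.
prompt = "Question: What are the symptoms of glaucoma?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```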

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

The model was fine-tuned and evaluated on the MedQuAD medical question-answering dataset (Ben Abacha and Demner-Fushman, 2019).

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.0002
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 12
  • total_train_batch_size: 192
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: reduce_lr_on_plateau
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 7
  • mixed_precision_training: Native AMP
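
For reproducibility, the hyperparameters above can be expressed as a transformers TrainingArguments configuration. This is only a sketch of how the reported values map onto the API: reading "Adam with betas=(0.9,0.999) and epsilon=1e-08" as the default AdamW settings, "Native AMP" as fp16, and the 10-step evaluation cadence from the results table are my assumptions; the dataset preparation and LoRA setup are not documented in this card.

```python
# Configuration sketch only: mirrors the hyperparameters listed above under the
# stated assumptions; it is not the author's training script.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="Llama-3.1-8B-medquad-V2",   # illustrative output directory
    learning_rate=2e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=12,          # 16 * 12 = 192 total train batch size
    lr_scheduler_type="reduce_lr_on_plateau",
    warmup_ratio=0.1,
    num_train_epochs=7,
    fp16=True,                               # assumed reading of "Native AMP"
    optim="adamw_torch",                     # betas=(0.9, 0.999), eps=1e-08 are the defaults
    eval_strategy="steps",
    eval_steps=10,                           # matches the 10-step cadence in the results table
    logging_steps=10,
)
```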

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 1.2503        | 0.1462 | 10   | 1.1359          |
| 1.1182        | 0.2923 | 20   | 1.0199          |
| 1.0864        | 0.4385 | 30   | 0.9856          |
| 0.9031        | 0.5847 | 40   | 0.9681          |
| 1.0773        | 0.7308 | 50   | 0.9499          |
| 0.9575        | 0.8770 | 60   | 0.9427          |
| 0.9768        | 1.0231 | 70   | 0.9452          |
| 0.9673        | 1.1693 | 80   | 0.9264          |
| 0.8541        | 1.3155 | 90   | 0.9282          |
| 0.9772        | 1.4616 | 100  | 0.9180          |
| 0.8427        | 1.6078 | 110  | 0.9211          |
| 0.9317        | 1.7540 | 120  | 0.9142          |
| 0.9498        | 1.9001 | 130  | 0.9011          |
| 0.8412        | 2.0463 | 140  | 0.9036          |
| 0.8990        | 2.1924 | 150  | 0.9031          |
| 0.7488        | 2.3386 | 160  | 0.8990          |
| 0.8824        | 2.4848 | 170  | 0.9033          |
| 0.8334        | 2.6309 | 180  | 0.8959          |

Framework versions

  • PEFT 0.13.0
  • Transformers 4.45.1
  • Pytorch 2.4.1+cu121
  • Datasets 3.0.1
  • Tokenizers 0.20.0