
scenario-NON-KD-SCR-D2_data-AmazonScience_massive_all_1_166

This model is a fine-tuned version of microsoft/mdeberta-v3-base on the MASSIVE dataset (AmazonScience/massive, per the model name). It achieves the following results on the evaluation set, taken at the final logged step (185000):

  • Loss: 1.9580
  • Accuracy: 0.8146
  • F1: 0.7901
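
The card does not include a usage example, so the following is only a minimal sketch of how a checkpoint like this is typically loaded. It assumes the repository ID given on this card, the generic `text-classification` pipeline from `transformers`, and network access to download the weights. The helper name `classify` is hypothetical.

```python
# Hypothetical usage sketch. Requires `transformers` and `torch`, plus network
# access to download the weights on first use.
MODEL_ID = "haryoaw/scenario-NON-KD-SCR-D2_data-AmazonScience_massive_all_1_166"


def classify(texts):
    """Return the pipeline's top prediction (label and score) per input string."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=MODEL_ID)
    return classifier(list(texts))
```

Calling `classify(["wake me up at nine am on friday"])` should return one dictionary per input with `label` and `score` keys. The label names come from the repository's config, which this card does not document, so they may be generic identifiers rather than readable class names.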

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 66
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10
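
To make the `linear` scheduler setting concrete, here is a rough sketch of the learning rate over training. It rests on assumptions the card does not state: zero warmup steps, decay to 0 at the last step, and a steps-per-epoch figure estimated from the results table (step 5000 is logged at epoch 0.2672).

```python
# Sketch of a linear learning-rate decay under the listed hyperparameters.
# Assumptions (not stated in the card): no warmup, decay to 0 at the final
# step, and about 18,713 optimizer steps per epoch, estimated from the
# results table (step 5000 is logged at epoch 0.2672).

LEARNING_RATE = 5e-05
NUM_EPOCHS = 10
STEPS_PER_EPOCH = round(5000 / 0.2672)  # estimate, not an official figure
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_lr(step: int, base_lr: float = LEARNING_RATE,
              total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate at `step` for linear decay without warmup."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps


for step in (0, 50_000, 100_000, 185_000):
    print(f"step {step:>7}: lr = {linear_lr(step):.3e}")
```

Under these assumptions the rate is already very small by the last logged step (185000), which is consistent with the metrics barely moving in the final epochs.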

Training results

Training Loss  Epoch   Step    Validation Loss  Accuracy  F1
1.2987         0.2672  5000    1.2851           0.6566    0.5593
0.9658         0.5344  10000   0.9963           0.7328    0.6604
0.808          0.8017  15000   0.8786           0.7661    0.7077
0.5621         1.0689  20000   0.8689           0.7807    0.7298
0.5504         1.3361  25000   0.8296           0.7897    0.7417
0.5334         1.6033  30000   0.8176           0.7940    0.7494
0.5188         1.8706  35000   0.7691           0.8056    0.7619
0.3426         2.1378  40000   0.8369           0.8057    0.7691
0.3362         2.4050  45000   0.8393           0.8053    0.7663
0.355          2.6722  50000   0.8216           0.8075    0.7742
0.3452         2.9394  55000   0.8262           0.8108    0.7767
0.2176         3.2067  60000   0.9217           0.8061    0.7754
0.2271         3.4739  65000   0.9242           0.8093    0.7800
0.2366         3.7411  70000   0.9183           0.8127    0.7835
0.2095         4.0083  75000   0.9789           0.8126    0.7802
0.146          4.2756  80000   1.0693           0.8097    0.7796
0.1526         4.5428  85000   1.0715           0.8119    0.7845
0.1606         4.8100  90000   1.0722           0.8150    0.7869
0.0916         5.0772  95000   1.2010           0.8157    0.7908
0.0957         5.3444  100000  1.2828           0.8112    0.7826
0.1065         5.6117  105000  1.2375           0.8146    0.7856
0.0993         5.8789  110000  1.2607           0.8141    0.7877
0.0583         6.1461  115000  1.4788           0.8122    0.7847
0.0688         6.4133  120000  1.4891           0.8117    0.7881
0.0673         6.6806  125000  1.5137           0.8129    0.7840
0.0781         6.9478  130000  1.5225           0.8110    0.7867
0.0453         7.2150  135000  1.6409           0.8132    0.7860
0.0487         7.4822  140000  1.6796           0.8125    0.7872
0.041          7.7495  145000  1.7356           0.8121    0.7864
0.035          8.0167  150000  1.7378           0.8139    0.7905
0.0293         8.2839  155000  1.8422           0.8118    0.7885
0.0289         8.5511  160000  1.8606           0.8126    0.7885
0.0271         8.8183  165000  1.8730           0.8140    0.7899
0.0186         9.0856  170000  1.9062           0.8139    0.7898
0.0197         9.3528  175000  1.9150           0.8142    0.7895
0.0274         9.6200  180000  1.9638           0.8133    0.7891
0.018          9.8872  185000  1.9580           0.8146    0.7901
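
The table can be summarised by asking which logged step is best under each metric. The sketch below copies the step, validation loss, accuracy, and F1 columns from the table and picks the minimum-loss and maximum-score rows. The card does not say which checkpoint the published weights correspond to, beyond the headline metrics matching the final row.

```python
# Rows copied from the training results table:
# (step, validation_loss, accuracy, f1)
ROWS = [
    (5000, 1.2851, 0.6566, 0.5593), (10000, 0.9963, 0.7328, 0.6604),
    (15000, 0.8786, 0.7661, 0.7077), (20000, 0.8689, 0.7807, 0.7298),
    (25000, 0.8296, 0.7897, 0.7417), (30000, 0.8176, 0.7940, 0.7494),
    (35000, 0.7691, 0.8056, 0.7619), (40000, 0.8369, 0.8057, 0.7691),
    (45000, 0.8393, 0.8053, 0.7663), (50000, 0.8216, 0.8075, 0.7742),
    (55000, 0.8262, 0.8108, 0.7767), (60000, 0.9217, 0.8061, 0.7754),
    (65000, 0.9242, 0.8093, 0.7800), (70000, 0.9183, 0.8127, 0.7835),
    (75000, 0.9789, 0.8126, 0.7802), (80000, 1.0693, 0.8097, 0.7796),
    (85000, 1.0715, 0.8119, 0.7845), (90000, 1.0722, 0.8150, 0.7869),
    (95000, 1.2010, 0.8157, 0.7908), (100000, 1.2828, 0.8112, 0.7826),
    (105000, 1.2375, 0.8146, 0.7856), (110000, 1.2607, 0.8141, 0.7877),
    (115000, 1.4788, 0.8122, 0.7847), (120000, 1.4891, 0.8117, 0.7881),
    (125000, 1.5137, 0.8129, 0.7840), (130000, 1.5225, 0.8110, 0.7867),
    (135000, 1.6409, 0.8132, 0.7860), (140000, 1.6796, 0.8125, 0.7872),
    (145000, 1.7356, 0.8121, 0.7864), (150000, 1.7378, 0.8139, 0.7905),
    (155000, 1.8422, 0.8118, 0.7885), (160000, 1.8606, 0.8126, 0.7885),
    (165000, 1.8730, 0.8140, 0.7899), (170000, 1.9062, 0.8139, 0.7898),
    (175000, 1.9150, 0.8142, 0.7895), (180000, 1.9638, 0.8133, 0.7891),
    (185000, 1.9580, 0.8146, 0.7901),
]

best_loss = min(ROWS, key=lambda row: row[1])  # lowest validation loss
best_acc = max(ROWS, key=lambda row: row[2])   # highest accuracy
best_f1 = max(ROWS, key=lambda row: row[3])    # highest F1
final = ROWS[-1]                               # last logged evaluation

print("lowest validation loss:", best_loss)
print("highest accuracy:      ", best_acc)
print("highest F1:            ", best_f1)
print("final step:            ", final)
```

Validation loss is lowest at step 35000 (0.7691) and rises steadily afterwards, while accuracy and F1 peak at step 95000 (0.8157 and 0.7908) and then stay roughly flat. That pattern suggests overfitting in terms of loss over the later epochs, with little change in classification quality.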

Framework versions

  • Transformers 4.44.2
  • Pytorch 2.1.1+cu121
  • Datasets 2.14.5
  • Tokenizers 0.19.1

Model size

  • Parameters: 236M
  • Tensor type: F32
  • Weights format: Safetensors

Model tree for haryoaw/scenario-NON-KD-SCR-D2_data-AmazonScience_massive_all_1_166

  • Base model: microsoft/mdeberta-v3-base
  • This model: fine-tuned from the base model