metadata
base_model: microsoft/mdeberta-v3-base
library_name: transformers
license: mit
metrics:
  - precision
  - recall
  - f1
  - accuracy
tags:
  - generated_from_trainer
model-index:
  - name: scenario-kd-pre-ner-full-mdeberta_data-univner_full44
    results: []

scenario-kd-pre-ner-full-mdeberta_data-univner_full44

This model is a fine-tuned version of microsoft/mdeberta-v3-base on an unspecified dataset (the dataset name was not recorded when the card was generated). It achieves the following results on the evaluation set:

  • Loss: 0.2650
  • Precision: 0.8107
  • Recall: 0.8117
  • F1: 0.8112
  • Accuracy: 0.9806
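As a quick sanity check on the numbers above, F1 is the harmonic mean of precision and recall. The sketch below recomputes it from the reported precision and recall, assuming all three values come from the same (micro-averaged) evaluation; the helper name `f1_score` is illustrative and not part of the training code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported for the final evaluation above.
precision, recall, reported_f1 = 0.8107, 0.8117, 0.8112

computed_f1 = f1_score(precision, recall)
print(f"computed F1 = {computed_f1:.4f}, reported F1 = {reported_f1:.4f}")
```

The recomputed value agrees with the reported F1 to four decimal places, so the three metrics are internally consistent.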

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 8
  • eval_batch_size: 32
  • seed: 44
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10
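Two of these settings are derived or schedule-related, so a small worked example may help. The sketch below shows how the total train batch size follows from the per-device batch size and gradient accumulation, and how a linear schedule decays the learning rate. It assumes a single device and no warmup (neither is stated in the list), and `total_steps` is a hypothetical figure chosen only for illustration; `linear_lr` is an illustrative helper, not the library's scheduler.

```python
# Values taken from the hyperparameter list above.
train_batch_size = 8
gradient_accumulation_steps = 4
learning_rate = 3e-05

# Assumption: training ran on a single device.
num_devices = 1
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices


def linear_lr(step: int, total_steps: int, base_lr: float, warmup_steps: int = 0) -> float:
    """Linear warmup (optional) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Hypothetical total number of optimizer steps, for illustration only.
total_steps = 17_180

print("total_train_batch_size:", total_train_batch_size)
for step in (0, total_steps // 2, total_steps):
    print(step, linear_lr(step, total_steps, learning_rate))
```

Under these assumptions the learning rate starts at 3e-05, reaches half that value midway through training, and ends at zero.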

Training results

| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy |
|---|---|---|---|---|---|---|---|
| 1.3559 | 0.2911 | 500 | 0.7891 | 0.4246 | 0.3525 | 0.3852 | 0.9433 |
| 0.7017 | 0.5822 | 1000 | 0.5422 | 0.6316 | 0.6298 | 0.6307 | 0.9646 |
| 0.528 | 0.8732 | 1500 | 0.4693 | 0.6856 | 0.6830 | 0.6843 | 0.9692 |
| 0.4354 | 1.1643 | 2000 | 0.4211 | 0.7101 | 0.7376 | 0.7236 | 0.9724 |
| 0.385 | 1.4554 | 2500 | 0.3893 | 0.7482 | 0.7374 | 0.7428 | 0.9747 |
| 0.3575 | 1.7465 | 3000 | 0.3713 | 0.7678 | 0.7331 | 0.7500 | 0.9752 |
| 0.3298 | 2.0375 | 3500 | 0.3550 | 0.7497 | 0.7800 | 0.7645 | 0.9761 |
| 0.2879 | 2.3286 | 4000 | 0.3492 | 0.7964 | 0.7367 | 0.7654 | 0.9763 |
| 0.2748 | 2.6197 | 4500 | 0.3272 | 0.7660 | 0.7924 | 0.7790 | 0.9782 |
| 0.2644 | 2.9108 | 5000 | 0.3192 | 0.7817 | 0.7811 | 0.7814 | 0.9779 |
| 0.2416 | 3.2019 | 5500 | 0.3239 | 0.8004 | 0.7681 | 0.7839 | 0.9782 |
| 0.2303 | 3.4929 | 6000 | 0.3085 | 0.7846 | 0.7966 | 0.7905 | 0.9787 |
| 0.2252 | 3.7840 | 6500 | 0.3051 | 0.7973 | 0.7883 | 0.7928 | 0.9787 |
| 0.2159 | 4.0751 | 7000 | 0.3045 | 0.7987 | 0.7908 | 0.7948 | 0.9790 |
| 0.2067 | 4.3662 | 7500 | 0.2979 | 0.7969 | 0.7943 | 0.7956 | 0.9793 |
| 0.2028 | 4.6573 | 8000 | 0.2924 | 0.7855 | 0.8132 | 0.7991 | 0.9792 |
| 0.1985 | 4.9483 | 8500 | 0.2904 | 0.8008 | 0.7986 | 0.7997 | 0.9791 |
| 0.1867 | 5.2394 | 9000 | 0.2884 | 0.8 | 0.8033 | 0.8017 | 0.9797 |
| 0.1838 | 5.5305 | 9500 | 0.2841 | 0.7997 | 0.8220 | 0.8107 | 0.9800 |
| 0.1838 | 5.8216 | 10000 | 0.2810 | 0.7895 | 0.8165 | 0.8028 | 0.9798 |
| 0.1786 | 6.1126 | 10500 | 0.2767 | 0.8065 | 0.8150 | 0.8108 | 0.9802 |
| 0.1719 | 6.4037 | 11000 | 0.2790 | 0.8133 | 0.8057 | 0.8095 | 0.9803 |
| 0.1706 | 6.6948 | 11500 | 0.2795 | 0.8140 | 0.7983 | 0.8061 | 0.9802 |
| 0.1695 | 6.9859 | 12000 | 0.2723 | 0.8124 | 0.8121 | 0.8123 | 0.9807 |
| 0.1638 | 7.2770 | 12500 | 0.2726 | 0.8070 | 0.8078 | 0.8074 | 0.9803 |
| 0.162 | 7.5680 | 13000 | 0.2724 | 0.8118 | 0.8173 | 0.8146 | 0.9807 |
| 0.1619 | 7.8591 | 13500 | 0.2678 | 0.8018 | 0.8235 | 0.8125 | 0.9805 |
| 0.1594 | 8.1502 | 14000 | 0.2719 | 0.8103 | 0.8068 | 0.8086 | 0.9800 |
| 0.1571 | 8.4413 | 14500 | 0.2688 | 0.8097 | 0.8127 | 0.8112 | 0.9805 |
| 0.1585 | 8.7324 | 15000 | 0.2673 | 0.8126 | 0.8150 | 0.8138 | 0.9806 |
| 0.1546 | 9.0234 | 15500 | 0.2658 | 0.8105 | 0.8120 | 0.8112 | 0.9805 |
| 0.1534 | 9.3145 | 16000 | 0.2652 | 0.8101 | 0.8198 | 0.8149 | 0.9807 |
| 0.1535 | 9.6056 | 16500 | 0.2646 | 0.8097 | 0.8140 | 0.8119 | 0.9807 |
| 0.1531 | 9.8967 | 17000 | 0.2650 | 0.8107 | 0.8117 | 0.8112 | 0.9806 |
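The log can also be read programmatically. The sketch below, using only the last four logged evaluations from the table above, estimates the number of optimizer steps per epoch from the final (epoch, step) pair and picks the evaluation with the best F1 and the lowest validation loss. The `EvalRow` type and variable names are illustrative; whether any of these checkpoints was actually saved or selected is not stated in the card.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    epoch: float
    step: int
    val_loss: float
    f1: float


# Last four evaluations, copied from the training results above.
rows = [
    EvalRow(9.0234, 15500, 0.2658, 0.8112),
    EvalRow(9.3145, 16000, 0.2652, 0.8149),
    EvalRow(9.6056, 16500, 0.2646, 0.8119),
    EvalRow(9.8967, 17000, 0.2650, 0.8112),
]

# Steps per epoch, estimated from the final logged (epoch, step) pair.
steps_per_epoch = rows[-1].step / rows[-1].epoch

best_by_f1 = max(rows, key=lambda r: r.f1)
best_by_loss = min(rows, key=lambda r: r.val_loss)

print(f"approx. steps per epoch: {steps_per_epoch:.1f}")
print(f"best F1 among these rows: {best_by_f1.f1} at step {best_by_f1.step}")
print(f"lowest validation loss among these rows: {best_by_loss.val_loss} at step {best_by_loss.step}")
```

Among these rows the best F1 and the lowest validation loss fall on different steps, so the choice of selection metric matters if a single checkpoint has to be picked.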

Framework versions

  • Transformers 4.44.2
  • PyTorch 2.1.1+cu121
  • Datasets 2.14.5
  • Tokenizers 0.19.1