# ST2_modernbert-large_hazard_V2
This model is a fine-tuned version of [answerdotai/ModernBERT-large](https://huggingface.co/answerdotai/ModernBERT-large) on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 1.6701
- F1: 0.8497
## Model description

More information needed

## Intended uses & limitations

More information needed
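
The intended uses are not documented, but the checkpoint loads as a standard sequence-classification model. The sketch below is illustrative only: the repository id is taken from the model page, the example input is invented, and the returned labels depend on the `id2label` mapping stored in the fine-tuned config, which is not documented here.

```python
# Minimal inference sketch (assumes the checkpoint was saved with a
# sequence-classification head on top of ModernBERT-large).
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="BenPhan/ST2_modernbert-large_hazard_V2",
)

# Example input is illustrative; the label set comes from the
# checkpoint's id2label mapping, which is not documented in this card.
print(classifier("Product recalled due to undeclared peanuts."))
```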
## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training; an illustrative `TrainingArguments` sketch follows the list:
- learning_rate: 5e-05
- train_batch_size: 36
- eval_batch_size: 16
- seed: 42
- optimizer: AdamW (torch fused, `adamw_torch_fused`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 200
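
For reference, the listed values map onto `TrainingArguments` roughly as shown below. The output directory and any settings not listed above are assumptions, not the original training script.

```python
# Hedged reconstruction of the reported hyperparameters; unlisted
# settings (output_dir, logging, etc.) are assumptions.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="ST2_modernbert-large_hazard_V2",  # assumed name
    learning_rate=5e-5,
    per_device_train_batch_size=36,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=200,
    eval_strategy="epoch",    # the results table reports one validation row per epoch
    logging_strategy="epoch",
)
```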
### Training results
Training Loss | Epoch | Step | Validation Loss | F1 |
---|---|---|---|---|
1.8843 | 1.0 | 128 | 0.8050 | 0.7972 |
0.7729 | 2.0 | 256 | 0.7046 | 0.8163 |
0.483 | 3.0 | 384 | 0.8013 | 0.8384 |
0.2768 | 4.0 | 512 | 0.8839 | 0.8348 |
0.1772 | 5.0 | 640 | 0.9444 | 0.8257 |
0.1544 | 6.0 | 768 | 0.9699 | 0.8292 |
0.1182 | 7.0 | 896 | 1.1826 | 0.8304 |
0.06 | 8.0 | 1024 | 1.0771 | 0.8334 |
0.0702 | 9.0 | 1152 | 1.0485 | 0.8435 |
0.0525 | 10.0 | 1280 | 1.0886 | 0.8406 |
0.0288 | 11.0 | 1408 | 1.2292 | 0.8484 |
0.0292 | 12.0 | 1536 | 1.1577 | 0.8513 |
0.0187 | 13.0 | 1664 | 1.2895 | 0.8478 |
0.012 | 14.0 | 1792 | 1.1460 | 0.8517 |
0.0066 | 15.0 | 1920 | 1.2281 | 0.8498 |
0.0048 | 16.0 | 2048 | 1.2578 | 0.8547 |
0.0034 | 17.0 | 2176 | 1.2525 | 0.8482 |
0.0028 | 18.0 | 2304 | 1.2799 | 0.8483 |
0.0038 | 19.0 | 2432 | 1.2747 | 0.8502 |
0.0022 | 20.0 | 2560 | 1.2907 | 0.8488 |
0.0021 | 21.0 | 2688 | 1.2864 | 0.8509 |
0.0034 | 22.0 | 2816 | 1.3089 | 0.8464 |
0.0029 | 23.0 | 2944 | 1.3077 | 0.8502 |
0.0015 | 24.0 | 3072 | 1.3103 | 0.8480 |
0.0021 | 25.0 | 3200 | 1.3275 | 0.8482 |
0.0033 | 26.0 | 3328 | 1.2898 | 0.8483 |
0.0015 | 27.0 | 3456 | 1.3258 | 0.8496 |
0.0018 | 28.0 | 3584 | 1.3326 | 0.8482 |
0.0022 | 29.0 | 3712 | 1.3351 | 0.8480 |
0.0027 | 30.0 | 3840 | 1.3325 | 0.8480 |
0.0014 | 31.0 | 3968 | 1.3183 | 0.8502 |
0.0023 | 32.0 | 4096 | 1.3379 | 0.8505 |
0.0026 | 33.0 | 4224 | 1.3498 | 0.8477 |
0.0009 | 34.0 | 4352 | 1.3428 | 0.8515 |
0.0027 | 35.0 | 4480 | 1.3274 | 0.8457 |
0.0024 | 36.0 | 4608 | 1.3600 | 0.8516 |
0.0003 | 37.0 | 4736 | 1.3427 | 0.8497 |
0.0029 | 38.0 | 4864 | 1.3627 | 0.8501 |
0.0023 | 39.0 | 4992 | 1.3649 | 0.8508 |
0.0017 | 40.0 | 5120 | 1.3472 | 0.8537 |
0.0028 | 41.0 | 5248 | 1.3738 | 0.8505 |
0.0021 | 42.0 | 5376 | 1.3650 | 0.8503 |
0.0014 | 43.0 | 5504 | 1.3771 | 0.8502 |
0.0014 | 44.0 | 5632 | 1.3775 | 0.8493 |
0.0013 | 45.0 | 5760 | 1.3687 | 0.8505 |
0.004 | 46.0 | 5888 | 1.3879 | 0.8480 |
0.0021 | 47.0 | 6016 | 1.3839 | 0.8513 |
0.0025 | 48.0 | 6144 | 1.3993 | 0.8505 |
0.002 | 49.0 | 6272 | 1.3779 | 0.8474 |
0.0708 | 50.0 | 6400 | 1.2382 | 0.7673 |
0.2818 | 51.0 | 6528 | 1.1139 | 0.8300 |
0.119 | 52.0 | 6656 | 1.1885 | 0.8333 |
0.0869 | 53.0 | 6784 | 1.3279 | 0.8517 |
0.0256 | 54.0 | 6912 | 1.2980 | 0.8349 |
0.0131 | 55.0 | 7040 | 1.3607 | 0.8446 |
0.0196 | 56.0 | 7168 | 1.3559 | 0.8439 |
0.0079 | 57.0 | 7296 | 1.3945 | 0.8471 |
0.0104 | 58.0 | 7424 | 1.3243 | 0.8511 |
0.0068 | 59.0 | 7552 | 1.3076 | 0.8447 |
0.0018 | 60.0 | 7680 | 1.3236 | 0.8504 |
0.0023 | 61.0 | 7808 | 1.3291 | 0.8528 |
0.002 | 62.0 | 7936 | 1.3434 | 0.8528 |
0.0018 | 63.0 | 8064 | 1.3511 | 0.8527 |
0.0011 | 64.0 | 8192 | 1.3616 | 0.8528 |
0.002 | 65.0 | 8320 | 1.3664 | 0.8527 |
0.0011 | 66.0 | 8448 | 1.3727 | 0.8518 |
0.0016 | 67.0 | 8576 | 1.3782 | 0.8499 |
0.0015 | 68.0 | 8704 | 1.3856 | 0.8499 |
0.0011 | 69.0 | 8832 | 1.3911 | 0.8499 |
0.0029 | 70.0 | 8960 | 1.3953 | 0.8499 |
0.0011 | 71.0 | 9088 | 1.3985 | 0.8481 |
0.0021 | 72.0 | 9216 | 1.3969 | 0.8490 |
0.0014 | 73.0 | 9344 | 1.4042 | 0.8496 |
0.0013 | 74.0 | 9472 | 1.4017 | 0.8490 |
0.0022 | 75.0 | 9600 | 1.4120 | 0.8472 |
0.0013 | 76.0 | 9728 | 1.4123 | 0.8478 |
0.0019 | 77.0 | 9856 | 1.4162 | 0.8464 |
0.0015 | 78.0 | 9984 | 1.4161 | 0.8472 |
0.0019 | 79.0 | 10112 | 1.4222 | 0.8457 |
0.0015 | 80.0 | 10240 | 1.4282 | 0.8464 |
0.0016 | 81.0 | 10368 | 1.4310 | 0.8457 |
0.0024 | 82.0 | 10496 | 1.4350 | 0.8457 |
0.0022 | 83.0 | 10624 | 1.4294 | 0.8457 |
0.001 | 84.0 | 10752 | 1.4353 | 0.8457 |
0.0013 | 85.0 | 10880 | 1.4411 | 0.8457 |
0.002 | 86.0 | 11008 | 1.4430 | 0.8457 |
0.0021 | 87.0 | 11136 | 1.4475 | 0.8457 |
0.0009 | 88.0 | 11264 | 1.4501 | 0.8464 |
0.0021 | 89.0 | 11392 | 1.4514 | 0.8474 |
0.0018 | 90.0 | 11520 | 1.4572 | 0.8474 |
0.0014 | 91.0 | 11648 | 1.4623 | 0.8474 |
0.0024 | 92.0 | 11776 | 1.4607 | 0.8474 |
0.0017 | 93.0 | 11904 | 1.4692 | 0.8465 |
0.0016 | 94.0 | 12032 | 1.4718 | 0.8474 |
0.0021 | 95.0 | 12160 | 1.4728 | 0.8474 |
0.0013 | 96.0 | 12288 | 1.4704 | 0.8474 |
0.0017 | 97.0 | 12416 | 1.4814 | 0.8465 |
0.0015 | 98.0 | 12544 | 1.4810 | 0.8465 |
0.0014 | 99.0 | 12672 | 1.4789 | 0.8465 |
0.0015 | 100.0 | 12800 | 1.4855 | 0.8465 |
0.0018 | 101.0 | 12928 | 1.4812 | 0.8479 |
0.0017 | 102.0 | 13056 | 1.4880 | 0.8465 |
0.0015 | 103.0 | 13184 | 1.4897 | 0.8465 |
0.0016 | 104.0 | 13312 | 1.4935 | 0.8465 |
0.0017 | 105.0 | 13440 | 1.4956 | 0.8465 |
0.0022 | 106.0 | 13568 | 1.5053 | 0.8472 |
0.0012 | 107.0 | 13696 | 1.5083 | 0.8485 |
0.0018 | 108.0 | 13824 | 1.4983 | 0.8472 |
0.0007 | 109.0 | 13952 | 1.5016 | 0.8465 |
0.0021 | 110.0 | 14080 | 1.5054 | 0.8485 |
0.0014 | 111.0 | 14208 | 1.5118 | 0.8472 |
0.0021 | 112.0 | 14336 | 1.5125 | 0.8463 |
0.0007 | 113.0 | 14464 | 1.5155 | 0.8485 |
0.0017 | 114.0 | 14592 | 1.5181 | 0.8472 |
0.0013 | 115.0 | 14720 | 1.5199 | 0.8485 |
0.001 | 116.0 | 14848 | 1.5237 | 0.8472 |
0.0015 | 117.0 | 14976 | 1.5314 | 0.8485 |
0.0016 | 118.0 | 15104 | 1.5173 | 0.8485 |
0.0008 | 119.0 | 15232 | 1.5214 | 0.8485 |
0.0023 | 120.0 | 15360 | 1.5386 | 0.8485 |
0.0016 | 121.0 | 15488 | 1.5263 | 0.8500 |
0.002 | 122.0 | 15616 | 1.5669 | 0.8459 |
0.0014 | 123.0 | 15744 | 1.5301 | 0.8498 |
0.0016 | 124.0 | 15872 | 1.5602 | 0.8523 |
0.0017 | 125.0 | 16000 | 1.5304 | 0.8466 |
0.0012 | 126.0 | 16128 | 1.5654 | 0.8505 |
0.0016 | 127.0 | 16256 | 1.5521 | 0.8485 |
0.0016 | 128.0 | 16384 | 1.5729 | 0.8471 |
0.002 | 129.0 | 16512 | 1.5592 | 0.8505 |
0.0012 | 130.0 | 16640 | 1.5771 | 0.8505 |
0.0014 | 131.0 | 16768 | 1.5593 | 0.8505 |
0.0023 | 132.0 | 16896 | 1.5780 | 0.8471 |
0.0015 | 133.0 | 17024 | 1.5709 | 0.8505 |
0.0011 | 134.0 | 17152 | 1.5751 | 0.8471 |
0.002 | 135.0 | 17280 | 1.5774 | 0.8505 |
0.0014 | 136.0 | 17408 | 1.5873 | 0.8471 |
0.0017 | 137.0 | 17536 | 1.5800 | 0.8471 |
0.0009 | 138.0 | 17664 | 1.5965 | 0.8469 |
0.0023 | 139.0 | 17792 | 1.5893 | 0.8473 |
0.0012 | 140.0 | 17920 | 1.5841 | 0.8474 |
0.0013 | 141.0 | 18048 | 1.5947 | 0.8511 |
0.0021 | 142.0 | 18176 | 1.5862 | 0.8506 |
0.0014 | 143.0 | 18304 | 1.5841 | 0.8476 |
0.001 | 144.0 | 18432 | 1.5877 | 0.8505 |
0.0017 | 145.0 | 18560 | 1.6091 | 0.8475 |
0.0016 | 146.0 | 18688 | 1.5897 | 0.8500 |
0.0015 | 147.0 | 18816 | 1.6097 | 0.8476 |
0.0013 | 148.0 | 18944 | 1.5787 | 0.8465 |
0.0011 | 149.0 | 19072 | 1.6175 | 0.8473 |
0.0018 | 150.0 | 19200 | 1.5987 | 0.8506 |
0.0013 | 151.0 | 19328 | 1.6082 | 0.8476 |
0.0006 | 152.0 | 19456 | 1.6167 | 0.8480 |
0.0027 | 153.0 | 19584 | 1.6071 | 0.8476 |
0.0013 | 154.0 | 19712 | 1.6168 | 0.8476 |
0.0018 | 155.0 | 19840 | 1.6168 | 0.8473 |
0.0009 | 156.0 | 19968 | 1.6226 | 0.8476 |
0.0016 | 157.0 | 20096 | 1.6181 | 0.8476 |
0.0013 | 158.0 | 20224 | 1.6293 | 0.8473 |
0.001 | 159.0 | 20352 | 1.6289 | 0.8476 |
0.0022 | 160.0 | 20480 | 1.6300 | 0.8476 |
0.001 | 161.0 | 20608 | 1.6345 | 0.8476 |
0.0015 | 162.0 | 20736 | 1.6372 | 0.8482 |
0.0019 | 163.0 | 20864 | 1.6363 | 0.8476 |
0.0009 | 164.0 | 20992 | 1.6382 | 0.8476 |
0.0017 | 165.0 | 21120 | 1.6452 | 0.8482 |
0.0006 | 166.0 | 21248 | 1.6411 | 0.8476 |
0.0017 | 167.0 | 21376 | 1.6400 | 0.8476 |
0.0017 | 168.0 | 21504 | 1.6405 | 0.8476 |
0.0015 | 169.0 | 21632 | 1.6504 | 0.8476 |
0.001 | 170.0 | 21760 | 1.6503 | 0.8482 |
0.0016 | 171.0 | 21888 | 1.6479 | 0.8480 |
0.0013 | 172.0 | 22016 | 1.6559 | 0.8482 |
0.001 | 173.0 | 22144 | 1.6468 | 0.8491 |
0.0017 | 174.0 | 22272 | 1.6544 | 0.8476 |
0.0011 | 175.0 | 22400 | 1.6523 | 0.8491 |
0.0013 | 176.0 | 22528 | 1.6539 | 0.8491 |
0.0012 | 177.0 | 22656 | 1.6566 | 0.8482 |
0.0015 | 178.0 | 22784 | 1.6589 | 0.8497 |
0.0015 | 179.0 | 22912 | 1.6624 | 0.8497 |
0.001 | 180.0 | 23040 | 1.6640 | 0.8497 |
0.0014 | 181.0 | 23168 | 1.6628 | 0.8497 |
0.0016 | 182.0 | 23296 | 1.6616 | 0.8491 |
0.0008 | 183.0 | 23424 | 1.6655 | 0.8497 |
0.0016 | 184.0 | 23552 | 1.6648 | 0.8491 |
0.0008 | 185.0 | 23680 | 1.6655 | 0.8491 |
0.0016 | 186.0 | 23808 | 1.6661 | 0.8491 |
0.0012 | 187.0 | 23936 | 1.6652 | 0.8491 |
0.001 | 188.0 | 24064 | 1.6690 | 0.8491 |
0.0014 | 189.0 | 24192 | 1.6676 | 0.8497 |
0.0012 | 190.0 | 24320 | 1.6671 | 0.8491 |
0.0012 | 191.0 | 24448 | 1.6690 | 0.8497 |
0.0008 | 192.0 | 24576 | 1.6696 | 0.8497 |
0.0016 | 193.0 | 24704 | 1.6691 | 0.8497 |
0.0012 | 194.0 | 24832 | 1.6703 | 0.8497 |
0.0012 | 195.0 | 24960 | 1.6701 | 0.8497 |
0.0014 | 196.0 | 25088 | 1.6689 | 0.8497 |
0.0014 | 197.0 | 25216 | 1.6698 | 0.8497 |
0.001 | 198.0 | 25344 | 1.6697 | 0.8497 |
0.0012 | 199.0 | 25472 | 1.6698 | 0.8497 |
0.0012 | 200.0 | 25600 | 1.6701 | 0.8497 |
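
The F1 column above is reported without its averaging mode. A sketch of a `compute_metrics` function that would produce such a score is shown below, assuming macro-averaged F1; swap the `average` argument if the original run used micro or weighted averaging.

```python
# Hedged sketch of an F1 metric function for the Trainer; the macro
# averaging mode is an assumption, not confirmed by this card.
import numpy as np
from sklearn.metrics import f1_score

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return {"f1": f1_score(labels, predictions, average="macro")}
```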
### Framework versions
- Transformers 4.48.0.dev0
- Pytorch 2.4.1+cu121
- Datasets 3.1.0
- Tokenizers 0.21.0