ST2_modernbert-base_product_V1

This model is a fine-tuned version of answerdotai/ModernBERT-base on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 3.5744
  • F1: 0.5126
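
The card does not state the task, but the F1 metric and the model name suggest a classification head. A minimal usage sketch, assuming a sequence-classification checkpoint (the input text is a placeholder; if the checkpoint is actually a token-classification model, swap in AutoModelForTokenClassification):

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "BenPhan/ST2_modernbert-base_product_V1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

text = "Example product description"  # placeholder input
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Map the highest-scoring logit back to its label name
predicted_id = logits.argmax(dim=-1).item()
print(model.config.id2label[predicted_id])
```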

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged reconstruction as code follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 36
  • eval_batch_size: 16
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 200
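
A sketch of these settings expressed as Hugging Face `TrainingArguments`; only the values listed above come from the card, and `output_dir` is a placeholder:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="ST2_modernbert-base_product_V1",  # placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=36,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adamw_torch_fused",   # OptimizerNames.ADAMW_TORCH_FUSED
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=200,
    eval_strategy="epoch",       # assumption: the table reports one eval per epoch
    logging_strategy="epoch",
)
```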

Training results

Training Loss | Epoch | Step | Validation Loss | F1
6.5711 1.0 124 6.1700 0.0106
5.9394 2.0 248 4.8593 0.1670
4.6045 3.0 372 3.8976 0.3029
2.9945 4.0 496 3.3339 0.4090
0.6088 5.0 620 3.0378 0.4559
0.2897 6.0 744 3.0109 0.4816
0.1962 7.0 868 3.1202 0.4717
0.1025 8.0 992 3.0573 0.4715
0.0459 9.0 1116 3.1049 0.4814
0.0489 10.0 1240 3.1277 0.4863
0.0292 11.0 1364 3.0771 0.4875
0.0426 12.0 1488 3.1132 0.4936
0.0164 13.0 1612 3.0944 0.5108
0.0447 14.0 1736 3.1211 0.4910
0.0103 15.0 1860 3.0736 0.5010
0.0134 16.0 1984 3.1056 0.4982
0.0082 17.0 2108 3.0771 0.4997
0.0068 18.0 2232 3.0851 0.5018
0.0088 19.0 2356 3.0996 0.4991
0.0024 20.0 2480 3.0850 0.5086
0.0059 21.0 2604 3.0509 0.5026
0.0048 22.0 2728 3.1242 0.5047
0.004 23.0 2852 3.1142 0.4969
0.0041 24.0 2976 3.1523 0.5023
0.0041 25.0 3100 3.1448 0.4995
0.0028 26.0 3224 3.1528 0.5006
0.0057 27.0 3348 3.1483 0.5007
0.0029 28.0 3472 3.1724 0.5020
0.0033 29.0 3596 3.1834 0.4958
0.0034 30.0 3720 3.1474 0.4977
0.0038 31.0 3844 3.1788 0.4989
0.0037 32.0 3968 3.1785 0.4935
0.0034 33.0 4092 3.1674 0.4987
0.0027 34.0 4216 3.1771 0.4987
0.0041 35.0 4340 3.1944 0.4945
0.0019 36.0 4464 3.1890 0.5032
0.0028 37.0 4588 3.1763 0.4951
0.0022 38.0 4712 3.2159 0.4958
0.0037 39.0 4836 3.2109 0.5032
0.0026 40.0 4960 3.1828 0.4959
0.0028 41.0 5084 3.2274 0.4982
0.0025 42.0 5208 3.1930 0.4970
0.0036 43.0 5332 3.2186 0.4945
0.0032 44.0 5456 3.2385 0.5010
0.0016 45.0 5580 3.2317 0.5019
0.0034 46.0 5704 3.2135 0.5036
0.0035 47.0 5828 3.3843 0.4602
0.1754 48.0 5952 3.2086 0.4761
0.1549 49.0 6076 3.3204 0.4804
0.0232 50.0 6200 3.3169 0.4906
0.0173 51.0 6324 3.3614 0.4905
0.0149 52.0 6448 3.3885 0.4814
0.0085 53.0 6572 3.3473 0.4901
0.0049 54.0 6696 3.3235 0.4996
0.0027 55.0 6820 3.3272 0.4986
0.0013 56.0 6944 3.3385 0.5022
0.0039 57.0 7068 3.3398 0.5050
0.0025 58.0 7192 3.3412 0.5063
0.0025 59.0 7316 3.3466 0.5044
0.0022 60.0 7440 3.3510 0.5054
0.0018 61.0 7564 3.3570 0.5063
0.003 62.0 7688 3.3553 0.5047
0.0027 63.0 7812 3.3642 0.5053
0.0027 64.0 7936 3.3615 0.5061
0.0019 65.0 8060 3.3664 0.5053
0.003 66.0 8184 3.3675 0.5059
0.0028 67.0 8308 3.3707 0.5063
0.0023 68.0 8432 3.3754 0.5060
0.0018 69.0 8556 3.3743 0.5061
0.0039 70.0 8680 3.3793 0.5080
0.0029 71.0 8804 3.3868 0.5101
0.0017 72.0 8928 3.3829 0.5062
0.0035 73.0 9052 3.3913 0.5072
0.0026 74.0 9176 3.3910 0.5085
0.0029 75.0 9300 3.3827 0.5110
0.0018 76.0 9424 3.3985 0.5084
0.0027 77.0 9548 3.3946 0.5068
0.0021 78.0 9672 3.3975 0.5113
0.0025 79.0 9796 3.3949 0.5066
0.0028 80.0 9920 3.4022 0.5098
0.0015 81.0 10044 3.4100 0.5082
0.0026 82.0 10168 3.3912 0.5120
0.0028 83.0 10292 3.4092 0.5122
0.0031 84.0 10416 3.3857 0.5125
0.002 85.0 10540 3.4220 0.5096
0.0013 86.0 10664 3.4071 0.5141
0.003 87.0 10788 3.4105 0.5148
0.002 88.0 10912 3.4124 0.5130
0.0025 89.0 11036 3.4248 0.5123
0.0028 90.0 11160 3.4086 0.5119
0.0022 91.0 11284 3.4224 0.5113
0.0024 92.0 11408 3.4295 0.5162
0.0022 93.0 11532 3.4159 0.5138
0.0025 94.0 11656 3.4153 0.5130
0.0023 95.0 11780 3.4355 0.5133
0.0027 96.0 11904 3.4323 0.5174
0.0018 97.0 12028 3.3888 0.5160
0.0029 98.0 12152 3.4415 0.5125
0.0028 99.0 12276 3.4289 0.5122
0.0024 100.0 12400 3.4408 0.5171
0.002 101.0 12524 3.4226 0.5148
0.0025 102.0 12648 3.4544 0.5147
0.0019 103.0 12772 3.4467 0.5148
0.0024 104.0 12896 3.4552 0.5188
0.0026 105.0 13020 3.4581 0.5178
0.0023 106.0 13144 3.4570 0.5159
0.0019 107.0 13268 3.4456 0.5136
0.0019 108.0 13392 3.4553 0.5170
0.0024 109.0 13516 3.4750 0.5115
0.0015 110.0 13640 3.4556 0.5195
0.0027 111.0 13764 3.4916 0.5207
0.0023 112.0 13888 3.4637 0.5120
0.0014 113.0 14012 3.4714 0.5141
0.0026 114.0 14136 3.4919 0.5182
0.0024 115.0 14260 3.4987 0.5169
0.0016 116.0 14384 3.5065 0.5179
0.0023 117.0 14508 3.4585 0.5154
0.0019 118.0 14632 3.4927 0.5139
0.0014 119.0 14756 3.4963 0.5150
0.0031 120.0 14880 3.5130 0.5190
0.0021 121.0 15004 3.4772 0.5117
0.0021 122.0 15128 3.5224 0.5131
0.003 123.0 15252 3.4794 0.5165
0.0013 124.0 15376 3.4911 0.5099
0.0064 125.0 15500 3.5023 0.4909
0.0238 126.0 15624 3.3523 0.5030
0.015 127.0 15748 3.4065 0.5028
0.0039 128.0 15872 3.3460 0.4963
0.0067 129.0 15996 3.3763 0.5062
0.0021 130.0 16120 3.3880 0.5077
0.0018 131.0 16244 3.3969 0.5093
0.0022 132.0 16368 3.4017 0.5100
0.0021 133.0 16492 3.4123 0.5084
0.002 134.0 16616 3.4158 0.5122
0.0019 135.0 16740 3.4215 0.5117
0.0017 136.0 16864 3.4257 0.5103
0.0023 137.0 16988 3.4289 0.5141
0.0018 138.0 17112 3.4344 0.5101
0.0023 139.0 17236 3.4371 0.5110
0.0014 140.0 17360 3.4411 0.5133
0.0019 141.0 17484 3.4437 0.5127
0.002 142.0 17608 3.4484 0.5138
0.002 143.0 17732 3.4503 0.5127
0.0017 144.0 17856 3.4534 0.5117
0.0015 145.0 17980 3.4578 0.5143
0.0015 146.0 18104 3.4613 0.5099
0.002 147.0 18228 3.4645 0.5109
0.0012 148.0 18352 3.4679 0.5097
0.0026 149.0 18476 3.4691 0.5092
0.0016 150.0 18600 3.4734 0.5088
0.0017 151.0 18724 3.4754 0.5102
0.0023 152.0 18848 3.4798 0.5114
0.0024 153.0 18972 3.4822 0.5082
0.001 154.0 19096 3.4831 0.5104
0.002 155.0 19220 3.4885 0.5083
0.0016 156.0 19344 3.4904 0.5104
0.0014 157.0 19468 3.4935 0.5101
0.0021 158.0 19592 3.4980 0.5103
0.0021 159.0 19716 3.4991 0.5113
0.0015 160.0 19840 3.5031 0.5100
0.0026 161.0 19964 3.5050 0.5101
0.0013 162.0 20088 3.5095 0.5089
0.0018 163.0 20212 3.5128 0.5099
0.0021 164.0 20336 3.5148 0.5116
0.0021 165.0 20460 3.5149 0.5116
0.0017 166.0 20584 3.5189 0.5095
0.0018 167.0 20708 3.5232 0.5120
0.0018 168.0 20832 3.5277 0.5099
0.0024 169.0 20956 3.5277 0.5115
0.0015 170.0 21080 3.5282 0.5111
0.0015 171.0 21204 3.5295 0.5107
0.0013 172.0 21328 3.5328 0.5105
0.0044 173.0 21452 3.5402 0.5098
0.0023 174.0 21576 3.5429 0.5120
0.0021 175.0 21700 3.5419 0.5099
0.0014 176.0 21824 3.5467 0.5116
0.0025 177.0 21948 3.5475 0.5122
0.0017 178.0 22072 3.5460 0.5117
0.0015 179.0 22196 3.5513 0.5108
0.0019 180.0 22320 3.5513 0.5135
0.0016 181.0 22444 3.5539 0.5128
0.0022 182.0 22568 3.5585 0.5131
0.0013 183.0 22692 3.5599 0.5150
0.0012 184.0 22816 3.5590 0.5151
0.0023 185.0 22940 3.5587 0.5142
0.002 186.0 23064 3.5601 0.5145
0.0011 187.0 23188 3.5630 0.5133
0.0019 188.0 23312 3.5662 0.5163
0.0021 189.0 23436 3.5643 0.5132
0.0012 190.0 23560 3.5684 0.5128
0.0021 191.0 23684 3.5681 0.5138
0.0019 192.0 23808 3.5700 0.5139
0.0018 193.0 23932 3.5721 0.5137
0.0019 194.0 24056 3.5742 0.5161
0.0017 195.0 24180 3.5719 0.5118
0.0016 196.0 24304 3.5718 0.5160
0.0018 197.0 24428 3.5737 0.5135
0.002 198.0 24552 3.5740 0.5144
0.001 199.0 24676 3.5746 0.5143
0.0014 200.0 24800 3.5744 0.5126
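
The card does not say how F1 was computed. A minimal sketch of a `compute_metrics` callback that would produce such values, assuming single-label classification; the macro averaging mode is an assumption:

```python
import numpy as np
from sklearn.metrics import f1_score

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    # "macro" averaging is an assumption; the card does not state the variant.
    return {"f1": f1_score(labels, predictions, average="macro")}
```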

Framework versions

  • Transformers 4.48.0.dev0
  • Pytorch 2.4.1+cu121
  • Datasets 3.1.0
  • Tokenizers 0.21.0