vit_rand_rvl-cdip_N1K
This model card does not name a base model or a training dataset (the dataset is recorded as None). The model achieves the following results on the evaluation set:
- Loss: 3.9745
- Accuracy: 0.551
- Brier Loss: 0.8083
- Nll: 3.9609
- F1 Micro: 0.551
- F1 Macro: 0.5474
- Ece: 0.3805
- Aurc: 0.2338
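The card does not say how these metrics were computed, so the following is only a minimal sketch of common definitions for three of them (Brier loss, NLL, and ECE). The function names, the summed-over-classes Brier convention, and the choice of 10 equal-width bins for ECE are all assumptions made for illustration, not the evaluation code behind the numbers above.

```python
import math

def brier_loss(probs, labels):
    """Mean over samples of the squared error between the predicted
    probability vector and the one-hot label, summed over classes.
    Under this (assumed) convention the value lies in [0, 2]."""
    total = 0.0
    for p, y in zip(probs, labels):
        total += sum((pk - (1.0 if k == y else 0.0)) ** 2 for k, pk in enumerate(p))
    return total / len(probs)

def nll(probs, labels, eps=1e-12):
    """Mean negative log-probability assigned to the true class."""
    return -sum(math.log(max(p[y], eps)) for p, y in zip(probs, labels)) / len(probs)

def ece(probs, labels, n_bins=10):
    """Expected calibration error with equal-width confidence bins
    (the bin count is an assumption)."""
    # Each bin holds: sample count, summed confidence, summed correctness.
    bins = [[0, 0.0, 0.0] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        conf = max(p)
        pred = p.index(conf)
        b = min(int(conf * n_bins), n_bins - 1)
        bins[b][0] += 1
        bins[b][1] += conf
        bins[b][2] += 1.0 if pred == y else 0.0
    n = len(probs)
    # Sum over bins of (count / n) * |accuracy - mean confidence|.
    return sum(abs(s_conf - s_corr) / n for cnt, s_conf, s_corr in bins if cnt)

# Toy predictions for a two-class problem (made-up values, not model output).
toy_probs = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]]
toy_labels = [0, 1, 1]
toy_brier = brier_loss(toy_probs, toy_labels)
toy_nll = nll(toy_probs, toy_labels)
toy_ece = ece(toy_probs, toy_labels)
```

A large gap between accuracy and ECE, as in the final results above, indicates confidence scores that are far from the observed accuracy.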
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 100
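The arithmetic linking these values can be sketched as follows. The 250 steps per epoch are read from the training results table; the single-device setup and the warmup-then-linear-decay shape of the schedule are assumptions, since the card states only `linear` and a warmup ratio of 0.1. The helper `linear_lr` is hypothetical and not part of any library.

```python
# Values taken from the hyperparameter list.
train_batch_size = 4
gradient_accumulation_steps = 16
learning_rate = 2e-05
warmup_ratio = 0.1
num_epochs = 100

# Effective batch size per optimizer step (assuming a single device).
total_train_batch_size = train_batch_size * gradient_accumulation_steps

# 250 optimizer steps per epoch, as logged in the training results table.
steps_per_epoch = 250
total_steps = steps_per_epoch * num_epochs
warmup_steps = round(warmup_ratio * total_steps)

# Implied number of training examples seen per epoch (an inference, not stated in the card).
examples_per_epoch = steps_per_epoch * total_train_batch_size

def linear_lr(step, peak=learning_rate, warmup=warmup_steps, total=total_steps):
    """Assumed schedule shape: linear ramp to the peak, then linear decay to zero."""
    if step < warmup:
        return peak * step / warmup
    return peak * max(0.0, (total - step) / (total - warmup))
```

Under these assumptions the learning rate peaks at step 2500 (the end of epoch 10) and reaches zero at step 25000.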
Training results
Training Loss | Epoch | Step | Validation Loss | Accuracy | Brier Loss | Nll | F1 Micro | F1 Macro | Ece | Aurc |
---|---|---|---|---|---|---|---|---|---|---|
No log | 1.0 | 250 | 2.6207 | 0.171 | 0.9078 | 5.8097 | 0.171 | 0.1129 | 0.0606 | 0.7132 |
2.6241 | 2.0 | 500 | 2.4608 | 0.1727 | 0.8843 | 4.0297 | 0.1727 | 0.1156 | 0.0641 | 0.6991 |
2.6241 | 3.0 | 750 | 2.4182 | 0.2177 | 0.8659 | 4.1324 | 0.2177 | 0.1603 | 0.0802 | 0.6191 |
2.3655 | 4.0 | 1000 | 2.2066 | 0.2828 | 0.8237 | 3.3597 | 0.2828 | 0.2456 | 0.0597 | 0.5384 |
2.3655 | 5.0 | 1250 | 2.0873 | 0.3322 | 0.7923 | 3.2747 | 0.3322 | 0.2940 | 0.0613 | 0.4790 |
2.0557 | 6.0 | 1500 | 1.9178 | 0.398 | 0.7392 | 3.1146 | 0.398 | 0.3639 | 0.0589 | 0.3937 |
2.0557 | 7.0 | 1750 | 1.7861 | 0.458 | 0.7025 | 2.9045 | 0.458 | 0.4450 | 0.0778 | 0.3497 |
1.7262 | 8.0 | 2000 | 1.7288 | 0.4535 | 0.6821 | 2.9955 | 0.4535 | 0.4322 | 0.0528 | 0.3262 |
1.7262 | 9.0 | 2250 | 1.6881 | 0.472 | 0.6673 | 2.8844 | 0.472 | 0.4561 | 0.0563 | 0.3120 |
1.4846 | 10.0 | 2500 | 1.6912 | 0.4688 | 0.6633 | 2.8541 | 0.4688 | 0.4540 | 0.0718 | 0.3006 |
1.4846 | 11.0 | 2750 | 1.6094 | 0.5022 | 0.6353 | 2.8239 | 0.5022 | 0.4859 | 0.0759 | 0.2724 |
1.1972 | 12.0 | 3000 | 1.5364 | 0.535 | 0.6084 | 2.7911 | 0.535 | 0.5162 | 0.0905 | 0.2413 |
1.1972 | 13.0 | 3250 | 1.5683 | 0.521 | 0.6228 | 2.7486 | 0.521 | 0.5268 | 0.1003 | 0.2559 |
0.8678 | 14.0 | 3500 | 1.6246 | 0.5325 | 0.6246 | 2.8388 | 0.5325 | 0.5295 | 0.1304 | 0.2486 |
0.8678 | 15.0 | 3750 | 1.7502 | 0.5138 | 0.6555 | 2.9705 | 0.5138 | 0.5093 | 0.1750 | 0.2547 |
0.5268 | 16.0 | 4000 | 1.8375 | 0.5215 | 0.6677 | 2.9906 | 0.5215 | 0.5186 | 0.2099 | 0.2535 |
0.5268 | 17.0 | 4250 | 1.9606 | 0.524 | 0.6895 | 3.2415 | 0.524 | 0.5174 | 0.2425 | 0.2488 |
0.2667 | 18.0 | 4500 | 2.0553 | 0.5305 | 0.6953 | 3.2430 | 0.5305 | 0.5223 | 0.2554 | 0.2434 |
0.2667 | 19.0 | 4750 | 2.3400 | 0.5228 | 0.7369 | 3.5472 | 0.5228 | 0.5101 | 0.2871 | 0.2605 |
0.1513 | 20.0 | 5000 | 2.3720 | 0.5192 | 0.7472 | 3.4681 | 0.5192 | 0.5178 | 0.2982 | 0.2674 |
0.1513 | 21.0 | 5250 | 2.4935 | 0.52 | 0.7588 | 3.4578 | 0.52 | 0.5104 | 0.3101 | 0.2586 |
0.1164 | 22.0 | 5500 | 2.4916 | 0.5155 | 0.7625 | 3.3908 | 0.5155 | 0.5090 | 0.3129 | 0.2634 |
0.1164 | 23.0 | 5750 | 2.5740 | 0.523 | 0.7647 | 3.4298 | 0.523 | 0.5235 | 0.3220 | 0.2601 |
0.0883 | 24.0 | 6000 | 2.5887 | 0.5305 | 0.7598 | 3.4432 | 0.5305 | 0.5307 | 0.3194 | 0.2571 |
0.0883 | 25.0 | 6250 | 2.7429 | 0.52 | 0.7747 | 3.7692 | 0.52 | 0.5132 | 0.3291 | 0.2696 |
0.0739 | 26.0 | 6500 | 2.7728 | 0.5235 | 0.7828 | 3.4718 | 0.5235 | 0.5271 | 0.3399 | 0.2679 |
0.0739 | 27.0 | 6750 | 2.7862 | 0.5335 | 0.7680 | 3.5774 | 0.5335 | 0.5352 | 0.3256 | 0.2651 |
0.0619 | 28.0 | 7000 | 2.9449 | 0.5222 | 0.7964 | 3.6659 | 0.5222 | 0.5165 | 0.3503 | 0.2697 |
0.0619 | 29.0 | 7250 | 2.8872 | 0.5345 | 0.7714 | 3.5298 | 0.5345 | 0.5310 | 0.3376 | 0.2545 |
0.0531 | 30.0 | 7500 | 2.9649 | 0.5232 | 0.7994 | 3.6119 | 0.5232 | 0.5191 | 0.3527 | 0.2714 |
0.0531 | 31.0 | 7750 | 3.1024 | 0.5182 | 0.8112 | 3.6716 | 0.5182 | 0.5206 | 0.3639 | 0.2748 |
0.0446 | 32.0 | 8000 | 3.0895 | 0.5218 | 0.8036 | 3.6731 | 0.5218 | 0.5226 | 0.3609 | 0.2669 |
0.0446 | 33.0 | 8250 | 3.1813 | 0.5202 | 0.8130 | 3.6839 | 0.5202 | 0.5236 | 0.3675 | 0.2637 |
0.0368 | 34.0 | 8500 | 3.2535 | 0.5335 | 0.8011 | 3.6982 | 0.5335 | 0.5302 | 0.3653 | 0.2572 |
0.0368 | 35.0 | 8750 | 3.1969 | 0.5265 | 0.8021 | 3.7238 | 0.5265 | 0.5239 | 0.3649 | 0.2558 |
0.0364 | 36.0 | 9000 | 3.3875 | 0.5165 | 0.8174 | 4.0335 | 0.5165 | 0.5051 | 0.3675 | 0.2645 |
0.0364 | 37.0 | 9250 | 3.3883 | 0.5248 | 0.8168 | 3.8867 | 0.5248 | 0.5152 | 0.3768 | 0.2529 |
0.0338 | 38.0 | 9500 | 3.3876 | 0.5255 | 0.8198 | 3.6397 | 0.5255 | 0.5278 | 0.3791 | 0.2679 |
0.0338 | 39.0 | 9750 | 3.3675 | 0.5282 | 0.8201 | 3.7412 | 0.5282 | 0.5317 | 0.3774 | 0.2561 |
0.0277 | 40.0 | 10000 | 3.6788 | 0.5005 | 0.8597 | 4.1427 | 0.5005 | 0.4880 | 0.3966 | 0.2757 |
0.0277 | 41.0 | 10250 | 3.5608 | 0.522 | 0.8299 | 3.7769 | 0.522 | 0.5230 | 0.3828 | 0.2749 |
0.0177 | 42.0 | 10500 | 3.6388 | 0.5275 | 0.8242 | 4.0808 | 0.5275 | 0.5134 | 0.3817 | 0.2508 |
0.0177 | 43.0 | 10750 | 3.7068 | 0.532 | 0.8199 | 4.1084 | 0.532 | 0.5198 | 0.3809 | 0.2480 |
0.018 | 44.0 | 11000 | 3.7589 | 0.5258 | 0.8315 | 3.9264 | 0.5258 | 0.5172 | 0.3877 | 0.2624 |
0.018 | 45.0 | 11250 | 3.7492 | 0.518 | 0.8437 | 3.9257 | 0.518 | 0.5180 | 0.3951 | 0.2684 |
0.0186 | 46.0 | 11500 | 3.7641 | 0.5275 | 0.8306 | 3.9749 | 0.5275 | 0.5277 | 0.3877 | 0.2595 |
0.0186 | 47.0 | 11750 | 3.8842 | 0.52 | 0.8491 | 4.1807 | 0.52 | 0.5182 | 0.3949 | 0.2658 |
0.0159 | 48.0 | 12000 | 3.8731 | 0.5292 | 0.8318 | 3.9345 | 0.5292 | 0.5250 | 0.3902 | 0.2618 |
0.0159 | 49.0 | 12250 | 4.0101 | 0.519 | 0.8552 | 4.0796 | 0.519 | 0.5198 | 0.4025 | 0.2713 |
0.0118 | 50.0 | 12500 | 3.8631 | 0.5255 | 0.8288 | 4.0855 | 0.5255 | 0.5245 | 0.3891 | 0.2600 |
0.0118 | 51.0 | 12750 | 3.7895 | 0.5415 | 0.8143 | 3.9602 | 0.5415 | 0.5441 | 0.3809 | 0.2506 |
0.0125 | 52.0 | 13000 | 3.9434 | 0.523 | 0.8385 | 4.2268 | 0.523 | 0.5136 | 0.3951 | 0.2623 |
0.0125 | 53.0 | 13250 | 3.9239 | 0.5275 | 0.8391 | 4.0398 | 0.5275 | 0.5255 | 0.3952 | 0.2632 |
0.0087 | 54.0 | 13500 | 3.9463 | 0.5323 | 0.8307 | 4.1080 | 0.5323 | 0.5275 | 0.3905 | 0.2580 |
0.0087 | 55.0 | 13750 | 3.8462 | 0.5367 | 0.8210 | 3.9693 | 0.5367 | 0.5375 | 0.3825 | 0.2595 |
0.0093 | 56.0 | 14000 | 4.0603 | 0.5208 | 0.8449 | 4.2501 | 0.5208 | 0.5181 | 0.4019 | 0.2683 |
0.0093 | 57.0 | 14250 | 3.9614 | 0.5323 | 0.8240 | 4.1335 | 0.5323 | 0.5265 | 0.3863 | 0.2517 |
0.0082 | 58.0 | 14500 | 3.9553 | 0.548 | 0.8125 | 4.0319 | 0.548 | 0.5412 | 0.3822 | 0.2414 |
0.0082 | 59.0 | 14750 | 3.9586 | 0.5335 | 0.8325 | 4.0338 | 0.5335 | 0.5314 | 0.3902 | 0.2582 |
0.0069 | 60.0 | 15000 | 4.1072 | 0.531 | 0.8422 | 4.0678 | 0.531 | 0.5250 | 0.3997 | 0.2574 |
0.0069 | 61.0 | 15250 | 4.0455 | 0.5425 | 0.8173 | 4.0318 | 0.5425 | 0.5415 | 0.3881 | 0.2480 |
0.0054 | 62.0 | 15500 | 4.0208 | 0.531 | 0.8325 | 4.1704 | 0.531 | 0.5261 | 0.3912 | 0.2517 |
0.0054 | 63.0 | 15750 | 4.1167 | 0.5345 | 0.8325 | 4.2352 | 0.5345 | 0.5292 | 0.3926 | 0.2537 |
0.0054 | 64.0 | 16000 | 4.0246 | 0.5323 | 0.8339 | 4.0084 | 0.5323 | 0.5319 | 0.3940 | 0.2536 |
0.0054 | 65.0 | 16250 | 4.0535 | 0.5417 | 0.8203 | 4.1167 | 0.5417 | 0.5340 | 0.3875 | 0.2464 |
0.0048 | 66.0 | 16500 | 4.1987 | 0.5325 | 0.8371 | 4.2901 | 0.5325 | 0.5215 | 0.3979 | 0.2529 |
0.0048 | 67.0 | 16750 | 4.0956 | 0.5355 | 0.8264 | 4.3477 | 0.5355 | 0.5239 | 0.3889 | 0.2449 |
0.004 | 68.0 | 17000 | 3.9999 | 0.5423 | 0.8186 | 4.0645 | 0.5423 | 0.5453 | 0.3877 | 0.2487 |
0.004 | 69.0 | 17250 | 4.0824 | 0.538 | 0.8229 | 4.1670 | 0.538 | 0.5350 | 0.3887 | 0.2461 |
0.0053 | 70.0 | 17500 | 4.2158 | 0.5305 | 0.8479 | 4.2136 | 0.5305 | 0.5287 | 0.4002 | 0.2572 |
0.0053 | 71.0 | 17750 | 4.1586 | 0.533 | 0.8355 | 4.1576 | 0.533 | 0.5261 | 0.3942 | 0.2512 |
0.0041 | 72.0 | 18000 | 4.0781 | 0.5375 | 0.8296 | 4.1218 | 0.5375 | 0.5341 | 0.3930 | 0.2427 |
0.0041 | 73.0 | 18250 | 4.1389 | 0.5413 | 0.8229 | 4.0890 | 0.5413 | 0.5347 | 0.3918 | 0.2437 |
0.0028 | 74.0 | 18500 | 4.0675 | 0.5415 | 0.8212 | 4.0429 | 0.5415 | 0.5404 | 0.3920 | 0.2415 |
0.0028 | 75.0 | 18750 | 4.1044 | 0.5377 | 0.8294 | 4.1268 | 0.5377 | 0.5335 | 0.3955 | 0.2439 |
0.0027 | 76.0 | 19000 | 4.0731 | 0.5435 | 0.8193 | 4.0913 | 0.5435 | 0.5396 | 0.3892 | 0.2411 |
0.0027 | 77.0 | 19250 | 4.0768 | 0.5455 | 0.8158 | 4.0784 | 0.5455 | 0.5398 | 0.3885 | 0.2389 |
0.0028 | 78.0 | 19500 | 4.0665 | 0.5447 | 0.8187 | 4.0719 | 0.5447 | 0.5390 | 0.3876 | 0.2392 |
0.0028 | 79.0 | 19750 | 4.0475 | 0.5413 | 0.8204 | 4.0408 | 0.5413 | 0.5361 | 0.3927 | 0.2376 |
0.0026 | 80.0 | 20000 | 4.0176 | 0.5457 | 0.8101 | 4.0504 | 0.5457 | 0.5424 | 0.3844 | 0.2376 |
0.0026 | 81.0 | 20250 | 4.0408 | 0.5427 | 0.8181 | 4.0458 | 0.5427 | 0.5385 | 0.3888 | 0.2385 |
0.0027 | 82.0 | 20500 | 4.0392 | 0.5427 | 0.8207 | 4.0317 | 0.5427 | 0.5387 | 0.3897 | 0.2392 |
0.0027 | 83.0 | 20750 | 4.0163 | 0.545 | 0.8145 | 4.0292 | 0.545 | 0.5403 | 0.3868 | 0.2375 |
0.0026 | 84.0 | 21000 | 4.0057 | 0.5437 | 0.8165 | 4.0096 | 0.5437 | 0.5404 | 0.3867 | 0.2380 |
0.0026 | 85.0 | 21250 | 4.0096 | 0.544 | 0.8140 | 4.0733 | 0.544 | 0.5404 | 0.3861 | 0.2368 |
0.0026 | 86.0 | 21500 | 3.9696 | 0.5487 | 0.8087 | 4.0527 | 0.5487 | 0.5435 | 0.3824 | 0.2352 |
0.0026 | 87.0 | 21750 | 3.9826 | 0.5495 | 0.8103 | 4.0353 | 0.5495 | 0.5460 | 0.3820 | 0.2362 |
0.0025 | 88.0 | 22000 | 4.0171 | 0.5455 | 0.8147 | 4.0540 | 0.5455 | 0.5402 | 0.3865 | 0.2359 |
0.0025 | 89.0 | 22250 | 3.9745 | 0.5455 | 0.8138 | 3.9683 | 0.5455 | 0.5439 | 0.3867 | 0.2357 |
0.0025 | 90.0 | 22500 | 3.9811 | 0.5473 | 0.8098 | 3.9749 | 0.5473 | 0.5437 | 0.3842 | 0.2346 |
0.0025 | 91.0 | 22750 | 3.9800 | 0.5475 | 0.8122 | 3.9502 | 0.5475 | 0.5450 | 0.3839 | 0.2353 |
0.0025 | 92.0 | 23000 | 3.9844 | 0.5473 | 0.8103 | 3.9825 | 0.5473 | 0.5425 | 0.3840 | 0.2347 |
0.0025 | 93.0 | 23250 | 3.9876 | 0.5485 | 0.8107 | 3.9624 | 0.5485 | 0.5441 | 0.3826 | 0.2343 |
0.0025 | 94.0 | 23500 | 3.9751 | 0.5485 | 0.8086 | 3.9791 | 0.5485 | 0.5450 | 0.3831 | 0.2337 |
0.0025 | 95.0 | 23750 | 3.9765 | 0.548 | 0.8087 | 3.9863 | 0.548 | 0.5440 | 0.3839 | 0.2336 |
0.0024 | 96.0 | 24000 | 3.9764 | 0.5507 | 0.8077 | 3.9676 | 0.5507 | 0.5473 | 0.3807 | 0.2339 |
0.0024 | 97.0 | 24250 | 3.9695 | 0.549 | 0.8082 | 3.9494 | 0.549 | 0.5456 | 0.3819 | 0.2346 |
0.0023 | 98.0 | 24500 | 3.9733 | 0.5497 | 0.8080 | 3.9599 | 0.5497 | 0.5462 | 0.3815 | 0.2338 |
0.0023 | 99.0 | 24750 | 3.9727 | 0.5505 | 0.8081 | 3.9563 | 0.5505 | 0.5469 | 0.3807 | 0.2339 |
0.0023 | 100.0 | 25000 | 3.9745 | 0.551 | 0.8083 | 3.9609 | 0.551 | 0.5474 | 0.3805 | 0.2338 |
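Validation loss and accuracy peak at very different points in this table. The sketch below compares a handful of rows copied from it; the tuple layout and variable names are illustrative assumptions.

```python
# (epoch, validation_loss, accuracy, ece), copied from selected rows of the table.
rows = [
    (1, 2.6207, 0.171, 0.0606),
    (8, 1.7288, 0.4535, 0.0528),
    (12, 1.5364, 0.535, 0.0905),
    (13, 1.5683, 0.521, 0.1003),
    (50, 3.8631, 0.5255, 0.3891),
    (100, 3.9745, 0.551, 0.3805),
]

best_by_loss = min(rows, key=lambda r: r[1])      # lowest validation loss
best_by_accuracy = max(rows, key=lambda r: r[2])  # highest accuracy

print("lowest validation loss:", best_by_loss)
print("highest accuracy:", best_by_accuracy)
```

Validation loss is lowest at epoch 12 (1.5364), while accuracy is highest at epoch 100 (0.551), where validation loss has risen to 3.9745 and ECE to 0.3805. Which checkpoint is preferable depends on whether calibrated probabilities matter for the intended use.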
Framework versions
- Transformers 4.26.1
- PyTorch 1.13.1.post200
- Datasets 2.9.0
- Tokenizers 0.13.2