---
library_name: transformers
base_model: aubmindlab/bert-base-arabertv02
tags:
- generated_from_trainer
model-index:
- name: arabert_augWithOrig_disEqu_k5_organization_task1_fold0
  results: []
---

# arabert_augWithOrig_disEqu_k5_organization_task1_fold0

This model is a fine-tuned version of [aubmindlab/bert-base-arabertv02](https://huggingface.co/aubmindlab/bert-base-arabertv02) on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 0.9590
- Qwk: 0.6495
- Mse: 0.9590
- Rmse: 0.9793

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10

### Training results

| Training Loss | Epoch | Step | Validation Loss | Qwk | Mse | Rmse |
|:-------------:|:------:|:----:|:---------------:|:-------:|:------:|:------:|
| No log | 0.0253 | 2 | 3.0774 | -0.0262 | 3.0774 | 1.7542 |
| No log | 0.0506 | 4 | 1.7580 | -0.0383 | 1.7580 | 1.3259 |
| No log | 0.0759 | 6 | 1.1986 | 0.2566 | 1.1986 | 1.0948 |
| No log | 0.1013 | 8 | 1.0874 | 0.3256 | 1.0874 | 1.0428 |
| No log | 0.1266 | 10 | 0.9652 | 0.3686 | 0.9652 | 0.9824 |
| No log | 0.1519 | 12 | 0.9302 | 0.3418 | 0.9302 | 0.9645 |
| No log | 0.1772 | 14 | 1.0338 | 0.4126 | 1.0338 | 1.0168 |
| No log | 0.2025 | 16 | 1.3351 | 0.4830 | 1.3351 | 1.1555 |
| No log | 0.2278 | 18 | 1.6121 | 0.2416 | 1.6121 | 1.2697 |
| No log | 0.2532 | 20 | 1.8478 | 0.2145 | 1.8478 | 1.3593 |
| No log | 0.2785 | 22 | 1.8876 | 0.2145 | 1.8876 | 1.3739 |
| No log | 0.3038 | 24 | 1.5407 | 0.2553 | 1.5407 | 1.2413 |
| No log | 0.3291 | 26 | 1.0998 | 0.4085 | 1.0998 | 1.0487 |
| No log | 0.3544 | 28 | 1.0849 | 0.4867 | 1.0849 | 1.0416 |
| No log | 0.3797 | 30 | 1.2729 | 0.2910 | 1.2729 | 1.1282 |
| No log | 0.4051 | 32 | 1.2804 | 0.3111 | 1.2804 | 1.1316 |
| No log | 0.4304 | 34 | 1.2430 | 0.3623 | 1.2430 | 1.1149 |
| No log | 0.4557 | 36 | 1.1185 | 0.3404 | 1.1185 | 1.0576 |
| No log | 0.4810 | 38 | 1.0201 | 0.4919 | 1.0201 | 1.0100 |
| No log | 0.5063 | 40 | 1.0123 | 0.4919 | 1.0123 | 1.0061 |
| No log | 0.5316 | 42 | 1.2870 | 0.3298 | 1.2870 | 1.1345 |
| No log | 0.5570 | 44 | 1.3774 | 0.1895 | 1.3774 | 1.1736 |
| No log | 0.5823 | 46 | 1.2565 | 0.2168 | 1.2565 | 1.1209 |
| No log | 0.6076 | 48 | 1.1703 | 0.2478 | 1.1703 | 1.0818 |
| No log | 0.6329 | 50 | 1.1831 | 0.2478 | 1.1831 | 1.0877 |
| No log | 0.6582 | 52 | 1.1976 | 0.1874 | 1.1976 | 1.0943 |
| No log | 0.6835 | 54 | 1.1493 | 0.2788 | 1.1493 | 1.0720 |
| No log | 0.7089 | 56 | 1.2647 | 0.4590 | 1.2647 | 1.1246 |
| No log | 0.7342 | 58 | 1.5608 | 0.2793 | 1.5608 | 1.2493 |
| No log | 0.7595 | 60 | 1.9550 | 0.1873 | 1.9550 | 1.3982 |
| No log | 0.7848 | 62 | 2.2243 | 0.0801 | 2.2243 | 1.4914 |
| No log | 0.8101 | 64 | 2.3243 | 0.0550 | 2.3243 | 1.5246 |
| No log | 0.8354 | 66 | 2.2197 | 0.0050 | 2.2197 | 1.4899 |
| No log | 0.8608 | 68 | 2.0592 | -0.0943 | 2.0592 | 1.4350 |
| No log | 0.8861 | 70 | 1.9129 | -0.1867 | 1.9129 | 1.3831 |
| No log | 0.9114 | 72 | 1.7453 | -0.0024 | 1.7453 | 1.3211 |
| No log | 0.9367 | 74 | 1.5294 | 0.1497 | 1.5294 | 1.2367 |
| No log | 0.9620 | 76 | 1.2431 | 0.4059 | 1.2431 | 1.1149 |
| No log | 0.9873 | 78 | 0.9936 | 0.4355 | 0.9936 | 0.9968 |
| No log | 1.0127 | 80 | 0.9899 | 0.5281 | 0.9899 | 0.9949 |
| No log | 1.0380 | 82 | 0.9708 | 0.5281 | 0.9708 | 0.9853 |
| No log | 1.0633 | 84 | 0.9069 | 0.4528 | 0.9069 | 0.9523 |
| No log | 1.0886 | 86 | 0.8789 | 0.4712 | 0.8789 | 0.9375 |
| No log | 1.1139 | 88 | 0.9539 | 0.5488 | 0.9539 | 0.9767 |
| No log | 1.1392 | 90 | 1.2004 | 0.4071 | 1.2004 | 1.0956 |
| No log | 1.1646 | 92 | 1.3421 | 0.3322 | 1.3421 | 1.1585 |
| No log | 1.1899 | 94 | 1.4675 | 0.1893 | 1.4675 | 1.2114 |
| No log | 1.2152 | 96 | 1.6380 | 0.2705 | 1.6380 | 1.2799 |
| No log | 1.2405 | 98 | 1.6263 | 0.2705 | 1.6263 | 1.2753 |
| No log | 1.2658 | 100 | 1.4874 | 0.2705 | 1.4874 | 1.2196 |
| No log | 1.2911 | 102 | 1.1902 | 0.5051 | 1.1902 | 1.0910 |
| No log | 1.3165 | 104 | 1.0158 | 0.5882 | 1.0158 | 1.0079 |
| No log | 1.3418 | 106 | 1.0408 | 0.6155 | 1.0408 | 1.0202 |
| No log | 1.3671 | 108 | 1.1637 | 0.6280 | 1.1637 | 1.0788 |
| No log | 1.3924 | 110 | 1.4564 | 0.4581 | 1.4564 | 1.2068 |
| No log | 1.4177 | 112 | 1.5144 | 0.3837 | 1.5144 | 1.2306 |
| No log | 1.4430 | 114 | 1.4876 | 0.2435 | 1.4876 | 1.2197 |
| No log | 1.4684 | 116 | 1.3089 | 0.3322 | 1.3089 | 1.1441 |
| No log | 1.4937 | 118 | 1.0499 | 0.4845 | 1.0499 | 1.0247 |
| No log | 1.5190 | 120 | 0.8543 | 0.5645 | 0.8543 | 0.9243 |
| No log | 1.5443 | 122 | 0.8381 | 0.5645 | 0.8381 | 0.9155 |
| No log | 1.5696 | 124 | 0.9010 | 0.4878 | 0.9010 | 0.9492 |
| No log | 1.5949 | 126 | 1.1169 | 0.4830 | 1.1169 | 1.0568 |
| No log | 1.6203 | 128 | 1.1447 | 0.4830 | 1.1447 | 1.0699 |
| No log | 1.6456 | 130 | 1.0706 | 0.5542 | 1.0706 | 1.0347 |
| No log | 1.6709 | 132 | 0.8394 | 0.5645 | 0.8394 | 0.9162 |
| No log | 1.6962 | 134 | 0.8513 | 0.5645 | 0.8513 | 0.9226 |
| No log | 1.7215 | 136 | 0.9252 | 0.6254 | 0.9252 | 0.9619 |
| No log | 1.7468 | 138 | 1.0116 | 0.7337 | 1.0116 | 1.0058 |
| No log | 1.7722 | 140 | 0.9960 | 0.7337 | 0.9960 | 0.9980 |
| No log | 1.7975 | 142 | 1.2380 | 0.5477 | 1.2380 | 1.1127 |
| No log | 1.8228 | 144 | 1.3091 | 0.5688 | 1.3091 | 1.1442 |
| No log | 1.8481 | 146 | 1.3146 | 0.5688 | 1.3146 | 1.1466 |
| No log | 1.8734 | 148 | 1.2107 | 0.5273 | 1.2107 | 1.1003 |
| No log | 1.8987 | 150 | 1.0837 | 0.4167 | 1.0837 | 1.0410 |
| No log | 1.9241 | 152 | 0.9885 | 0.5471 | 0.9885 | 0.9943 |
| No log | 1.9494 | 154 | 0.8796 | 0.6775 | 0.8796 | 0.9379 |
| No log | 1.9747 | 156 | 0.8672 | 0.6775 | 0.8672 | 0.9312 |
| No log | 2.0 | 158 | 0.8542 | 0.7078 | 0.8542 | 0.9242 |
| No log | 2.0253 | 160 | 0.7751 | 0.6552 | 0.7751 | 0.8804 |
| No log | 2.0506 | 162 | 0.7942 | 0.5904 | 0.7942 | 0.8912 |
| No log | 2.0759 | 164 | 0.8465 | 0.5966 | 0.8465 | 0.9201 |
| No log | 2.1013 | 166 | 0.8012 | 0.6038 | 0.8012 | 0.8951 |
| No log | 2.1266 | 168 | 0.7811 | 0.6032 | 0.7811 | 0.8838 |
| No log | 2.1519 | 170 | 0.8591 | 0.6937 | 0.8591 | 0.9269 |
| No log | 2.1772 | 172 | 0.9255 | 0.6477 | 0.9255 | 0.9620 |
| No log | 2.2025 | 174 | 0.8632 | 0.6769 | 0.8632 | 0.9291 |
| No log | 2.2278 | 176 | 0.8458 | 0.6567 | 0.8458 | 0.9197 |
| No log | 2.2532 | 178 | 0.8685 | 0.6441 | 0.8685 | 0.9319 |
| No log | 2.2785 | 180 | 0.8494 | 0.6243 | 0.8494 | 0.9216 |
| No log | 2.3038 | 182 | 0.8104 | 0.6445 | 0.8104 | 0.9002 |
| No log | 2.3291 | 184 | 0.7472 | 0.6202 | 0.7472 | 0.8644 |
| No log | 2.3544 | 186 | 0.6915 | 0.6567 | 0.6915 | 0.8316 |
| No log | 2.3797 | 188 | 0.8101 | 0.7154 | 0.8101 | 0.9001 |
| No log | 2.4051 | 190 | 0.9747 | 0.5882 | 0.9747 | 0.9873 |
| No log | 2.4304 | 192 | 1.0964 | 0.6084 | 1.0964 | 1.0471 |
| No log | 2.4557 | 194 | 1.1503 | 0.6690 | 1.1503 | 1.0725 |
| No log | 2.4810 | 196 | 1.1398 | 0.6690 | 1.1398 | 1.0676 |
| No log | 2.5063 | 198 | 0.9993 | 0.7521 | 0.9993 | 0.9997 |
| No log | 2.5316 | 200 | 0.9294 | 0.6842 | 0.9294 | 0.9641 |
| No log | 2.5570 | 202 | 0.8507 | 0.6254 | 0.8507 | 0.9223 |
| No log | 2.5823 | 204 | 0.7773 | 0.6588 | 0.7773 | 0.8816 |
| No log | 2.6076 | 206 | 0.7322 | 0.6533 | 0.7322 | 0.8557 |
| No log | 2.6329 | 208 | 0.7458 | 0.6893 | 0.7458 | 0.8636 |
| No log | 2.6582 | 210 | 0.8646 | 0.7337 | 0.8646 | 0.9298 |
| No log | 2.6835 | 212 | 1.1840 | 0.6387 | 1.1840 | 1.0881 |
| No log | 2.7089 | 214 | 1.4548 | 0.5243 | 1.4548 | 1.2062 |
| No log | 2.7342 | 216 | 1.6320 | 0.4284 | 1.6320 | 1.2775 |
| No log | 2.7595 | 218 | 1.6462 | 0.2957 | 1.6462 | 1.2830 |
| No log | 2.7848 | 220 | 1.5380 | 0.2957 | 1.5380 | 1.2401 |
| No log | 2.8101 | 222 | 1.3049 | 0.5035 | 1.3049 | 1.1423 |
| No log | 2.8354 | 224 | 1.0489 | 0.6174 | 1.0489 | 1.0242 |
| No log | 2.8608 | 226 | 0.8360 | 0.7037 | 0.8360 | 0.9143 |
| No log | 2.8861 | 228 | 0.7461 | 0.6937 | 0.7461 | 0.8638 |
| No log | 2.9114 | 230 | 0.7432 | 0.6823 | 0.7432 | 0.8621 |
| No log | 2.9367 | 232 | 0.7739 | 0.6823 | 0.7739 | 0.8797 |
| No log | 2.9620 | 234 | 0.7990 | 0.7682 | 0.7990 | 0.8939 |
| No log | 2.9873 | 236 | 0.8627 | 0.6170 | 0.8627 | 0.9288 |
| No log | 3.0127 | 238 | 0.9529 | 0.6515 | 0.9529 | 0.9762 |
| No log | 3.0380 | 240 | 0.9967 | 0.6515 | 0.9967 | 0.9984 |
| No log | 3.0633 | 242 | 1.0102 | 0.6515 | 1.0102 | 1.0051 |
| No log | 3.0886 | 244 | 0.9726 | 0.6515 | 0.9726 | 0.9862 |
| No log | 3.1139 | 246 | 0.9170 | 0.7068 | 0.9170 | 0.9576 |
| No log | 3.1392 | 248 | 0.9438 | 0.6515 | 0.9438 | 0.9715 |
| No log | 3.1646 | 250 | 1.0382 | 0.6189 | 1.0382 | 1.0189 |
| No log | 3.1899 | 252 | 1.0078 | 0.6515 | 1.0078 | 1.0039 |
| No log | 3.2152 | 254 | 0.9682 | 0.7521 | 0.9682 | 0.9840 |
| No log | 3.2405 | 256 | 0.9392 | 0.7068 | 0.9392 | 0.9691 |
| No log | 3.2658 | 258 | 1.0439 | 0.6189 | 1.0439 | 1.0217 |
| No log | 3.2911 | 260 | 1.1935 | 0.5688 | 1.1935 | 1.0925 |
| No log | 3.3165 | 262 | 1.3865 | 0.3802 | 1.3865 | 1.1775 |
| No log | 3.3418 | 264 | 1.4320 | 0.4081 | 1.4320 | 1.1967 |
| No log | 3.3671 | 266 | 1.3977 | 0.5010 | 1.3977 | 1.1823 |
| No log | 3.3924 | 268 | 1.2084 | 0.6272 | 1.2084 | 1.0993 |
| No log | 3.4177 | 270 | 1.0299 | 0.7358 | 1.0299 | 1.0149 |
| No log | 3.4430 | 272 | 0.8702 | 0.6755 | 0.8702 | 0.9329 |
| No log | 3.4684 | 274 | 0.7982 | 0.6755 | 0.7982 | 0.8934 |
| No log | 3.4937 | 276 | 0.7652 | 0.6818 | 0.7652 | 0.8747 |
| No log | 3.5190 | 278 | 0.7618 | 0.6552 | 0.7618 | 0.8728 |
| No log | 3.5443 | 280 | 0.7834 | 0.7038 | 0.7834 | 0.8851 |
| No log | 3.5696 | 282 | 0.8455 | 0.7105 | 0.8455 | 0.9195 |
| No log | 3.5949 | 284 | 0.9339 | 0.7063 | 0.9339 | 0.9664 |
| No log | 3.6203 | 286 | 1.0177 | 0.6623 | 1.0177 | 1.0088 |
| No log | 3.6456 | 288 | 1.0926 | 0.6623 | 1.0926 | 1.0453 |
| No log | 3.6709 | 290 | 1.0955 | 0.6623 | 1.0955 | 1.0467 |
| No log | 3.6962 | 292 | 1.0092 | 0.6623 | 1.0092 | 1.0046 |
| No log | 3.7215 | 294 | 0.9854 | 0.7244 | 0.9854 | 0.9927 |
| No log | 3.7468 | 296 | 0.9994 | 0.5931 | 0.9994 | 0.9997 |
| No log | 3.7722 | 298 | 1.0171 | 0.6105 | 1.0171 | 1.0085 |
| No log | 3.7975 | 300 | 1.0805 | 0.6509 | 1.0805 | 1.0395 |
| No log | 3.8228 | 302 | 1.1836 | 0.6369 | 1.1836 | 1.0879 |
| No log | 3.8481 | 304 | 1.1536 | 0.6322 | 1.1536 | 1.0741 |
| No log | 3.8734 | 306 | 1.1137 | 0.6322 | 1.1137 | 1.0553 |
| No log | 3.8987 | 308 | 1.0252 | 0.6459 | 1.0252 | 1.0125 |
| No log | 3.9241 | 310 | 0.9462 | 0.7033 | 0.9462 | 0.9728 |
| No log | 3.9494 | 312 | 0.8855 | 0.6970 | 0.8855 | 0.9410 |
| No log | 3.9747 | 314 | 0.8449 | 0.6495 | 0.8449 | 0.9192 |
| No log | 4.0 | 316 | 0.8243 | 0.6495 | 0.8243 | 0.9079 |
| No log | 4.0253 | 318 | 0.8290 | 0.6937 | 0.8290 | 0.9105 |
| No log | 4.0506 | 320 | 0.8328 | 0.5653 | 0.8328 | 0.9126 |
| No log | 4.0759 | 322 | 0.8381 | 0.6067 | 0.8381 | 0.9155 |
| No log | 4.1013 | 324 | 0.7976 | 0.6655 | 0.7976 | 0.8931 |
| No log | 4.1266 | 326 | 0.8074 | 0.6541 | 0.8074 | 0.8985 |
| No log | 4.1519 | 328 | 0.8950 | 0.7196 | 0.8950 | 0.9460 |
| No log | 4.1772 | 330 | 1.0172 | 0.7358 | 1.0172 | 1.0086 |
| No log | 4.2025 | 332 | 1.1485 | 0.6856 | 1.1485 | 1.0717 |
| No log | 4.2278 | 334 | 1.2270 | 0.6514 | 1.2270 | 1.1077 |
| No log | 4.2532 | 336 | 1.1784 | 0.6322 | 1.1784 | 1.0856 |
| No log | 4.2785 | 338 | 1.0846 | 0.6077 | 1.0846 | 1.0414 |
| No log | 4.3038 | 340 | 1.0185 | 0.5471 | 1.0185 | 1.0092 |
| No log | 4.3291 | 342 | 0.9683 | 0.4878 | 0.9683 | 0.9840 |
| No log | 4.3544 | 344 | 0.9361 | 0.6888 | 0.9361 | 0.9675 |
| No log | 4.3797 | 346 | 0.9427 | 0.7095 | 0.9427 | 0.9709 |
| No log | 4.4051 | 348 | 0.9045 | 0.7358 | 0.9045 | 0.9510 |
| No log | 4.4304 | 350 | 0.8128 | 0.7128 | 0.8128 | 0.9016 |
| No log | 4.4557 | 352 | 0.7796 | 0.6603 | 0.7796 | 0.8830 |
| No log | 4.4810 | 354 | 0.7578 | 0.6148 | 0.7578 | 0.8705 |
| No log | 4.5063 | 356 | 0.7528 | 0.6230 | 0.7528 | 0.8676 |
| No log | 4.5316 | 358 | 0.7516 | 0.6230 | 0.7516 | 0.8669 |
| No log | 4.5570 | 360 | 0.7515 | 0.6268 | 0.7515 | 0.8669 |
| No log | 4.5823 | 362 | 0.7758 | 0.6218 | 0.7758 | 0.8808 |
| No log | 4.6076 | 364 | 0.8315 | 0.6764 | 0.8315 | 0.9119 |
| No log | 4.6329 | 366 | 0.9219 | 0.7763 | 0.9219 | 0.9602 |
| No log | 4.6582 | 368 | 0.9369 | 0.7369 | 0.9369 | 0.9679 |
| No log | 4.6835 | 370 | 0.9325 | 0.7921 | 0.9325 | 0.9657 |
| No log | 4.7089 | 372 | 0.8948 | 0.7441 | 0.8948 | 0.9460 |
| No log | 4.7342 | 374 | 0.8757 | 0.7441 | 0.8757 | 0.9358 |
| No log | 4.7595 | 376 | 0.8680 | 0.7176 | 0.8680 | 0.9317 |
| No log | 4.7848 | 378 | 0.9106 | 0.7441 | 0.9106 | 0.9543 |
| No log | 4.8101 | 380 | 0.9780 | 0.6882 | 0.9780 | 0.9889 |
| No log | 4.8354 | 382 | 1.0101 | 0.6329 | 1.0101 | 1.0050 |
| No log | 4.8608 | 384 | 1.0288 | 0.6509 | 1.0288 | 1.0143 |
| No log | 4.8861 | 386 | 1.0162 | 0.7610 | 1.0162 | 1.0080 |
| No log | 4.9114 | 388 | 0.9355 | 0.7196 | 0.9355 | 0.9672 |
| No log | 4.9367 | 390 | 0.8519 | 0.6708 | 0.8519 | 0.9230 |
| No log | 4.9620 | 392 | 0.8430 | 0.6429 | 0.8430 | 0.9181 |
| No log | 4.9873 | 394 | 0.8812 | 0.6429 | 0.8812 | 0.9387 |
| No log | 5.0127 | 396 | 0.9423 | 0.7176 | 0.9423 | 0.9707 |
| No log | 5.0380 | 398 | 1.0394 | 0.6002 | 1.0394 | 1.0195 |
| No log | 5.0633 | 400 | 1.0831 | 0.6189 | 1.0831 | 1.0407 |
| No log | 5.0886 | 402 | 1.0745 | 0.6189 | 1.0745 | 1.0366 |
| No log | 5.1139 | 404 | 1.1146 | 0.6509 | 1.1146 | 1.0558 |
| No log | 5.1392 | 406 | 1.1946 | 0.6322 | 1.1946 | 1.0930 |
| No log | 5.1646 | 408 | 1.2182 | 0.6322 | 1.2182 | 1.1037 |
| No log | 5.1899 | 410 | 1.1550 | 0.5995 | 1.1550 | 1.0747 |
| No log | 5.2152 | 412 | 1.1043 | 0.5808 | 1.1043 | 1.0508 |
| No log | 5.2405 | 414 | 1.0132 | 0.5938 | 1.0132 | 1.0066 |
| No log | 5.2658 | 416 | 0.9537 | 0.6764 | 0.9537 | 0.9766 |
| No log | 5.2911 | 418 | 0.9380 | 0.6708 | 0.9380 | 0.9685 |
| No log | 5.3165 | 420 | 0.8962 | 0.6218 | 0.8962 | 0.9467 |
| No log | 5.3418 | 422 | 0.8855 | 0.6218 | 0.8855 | 0.9410 |
| No log | 5.3671 | 424 | 0.9337 | 0.73 | 0.9337 | 0.9663 |
| No log | 5.3924 | 426 | 0.9475 | 0.73 | 0.9475 | 0.9734 |
| No log | 5.4177 | 428 | 0.8997 | 0.6852 | 0.8997 | 0.9485 |
| No log | 5.4430 | 430 | 0.8913 | 0.6801 | 0.8913 | 0.9441 |
| No log | 5.4684 | 432 | 0.8604 | 0.6451 | 0.8604 | 0.9276 |
| No log | 5.4937 | 434 | 0.8088 | 0.6451 | 0.8088 | 0.8993 |
| No log | 5.5190 | 436 | 0.7967 | 0.6451 | 0.7967 | 0.8926 |
| No log | 5.5443 | 438 | 0.8154 | 0.6451 | 0.8154 | 0.9030 |
| No log | 5.5696 | 440 | 0.8181 | 0.6451 | 0.8181 | 0.9045 |
| No log | 5.5949 | 442 | 0.8514 | 0.6451 | 0.8514 | 0.9227 |
| No log | 5.6203 | 444 | 0.9368 | 0.73 | 0.9368 | 0.9679 |
| No log | 5.6456 | 446 | 1.0100 | 0.7051 | 1.0100 | 1.0050 |
| No log | 5.6709 | 448 | 1.0098 | 0.7051 | 1.0098 | 1.0049 |
| No log | 5.6962 | 450 | 0.9977 | 0.7439 | 0.9977 | 0.9989 |
| No log | 5.7215 | 452 | 1.0071 | 0.6927 | 1.0071 | 1.0035 |
| No log | 5.7468 | 454 | 0.9490 | 0.7196 | 0.9490 | 0.9742 |
| No log | 5.7722 | 456 | 0.9163 | 0.7196 | 0.9163 | 0.9572 |
| No log | 5.7975 | 458 | 0.9075 | 0.6764 | 0.9075 | 0.9526 |
| No log | 5.8228 | 460 | 0.9687 | 0.6576 | 0.9687 | 0.9843 |
| No log | 5.8481 | 462 | 1.0109 | 0.6239 | 1.0109 | 1.0054 |
| No log | 5.8734 | 464 | 1.0419 | 0.5453 | 1.0419 | 1.0207 |
| No log | 5.8987 | 466 | 1.0707 | 0.5655 | 1.0707 | 1.0347 |
| No log | 5.9241 | 468 | 1.0540 | 0.6427 | 1.0540 | 1.0266 |
| No log | 5.9494 | 470 | 0.9815 | 0.6515 | 0.9815 | 0.9907 |
| No log | 5.9747 | 472 | 0.9342 | 0.6932 | 0.9342 | 0.9665 |
| No log | 6.0 | 474 | 0.9040 | 0.6932 | 0.9040 | 0.9508 |
| No log | 6.0253 | 476 | 0.8773 | 0.7181 | 0.8773 | 0.9367 |
| No log | 6.0506 | 478 | 0.8756 | 0.7109 | 0.8756 | 0.9357 |
| No log | 6.0759 | 480 | 0.9127 | 0.6764 | 0.9127 | 0.9553 |
| No log | 6.1013 | 482 | 0.9753 | 0.6471 | 0.9753 | 0.9876 |
| No log | 6.1266 | 484 | 1.0581 | 0.6748 | 1.0581 | 1.0287 |
| No log | 6.1519 | 486 | 1.0953 | 0.6917 | 1.0953 | 1.0465 |
| No log | 6.1772 | 488 | 1.0975 | 0.6748 | 1.0975 | 1.0476 |
| No log | 6.2025 | 490 | 1.0294 | 0.7024 | 1.0294 | 1.0146 |
| No log | 6.2278 | 492 | 0.9185 | 0.6970 | 0.9185 | 0.9584 |
| No log | 6.2532 | 494 | 0.8269 | 0.6602 | 0.8269 | 0.9093 |
| No log | 6.2785 | 496 | 0.7860 | 0.6268 | 0.7860 | 0.8866 |
| No log | 6.3038 | 498 | 0.7788 | 0.6268 | 0.7788 | 0.8825 |
| 0.6475 | 6.3291 | 500 | 0.7878 | 0.6268 | 0.7878 | 0.8876 |
| 0.6475 | 6.3544 | 502 | 0.8358 | 0.6602 | 0.8358 | 0.9142 |
| 0.6475 | 6.3797 | 504 | 0.9304 | 0.7363 | 0.9304 | 0.9646 |
| 0.6475 | 6.4051 | 506 | 1.0355 | 0.7350 | 1.0355 | 1.0176 |
| 0.6475 | 6.4304 | 508 | 1.0617 | 0.7350 | 1.0617 | 1.0304 |
| 0.6475 | 6.4557 | 510 | 1.0607 | 0.7597 | 1.0607 | 1.0299 |
| 0.6475 | 6.4810 | 512 | 1.0094 | 0.7208 | 1.0094 | 1.0047 |
| 0.6475 | 6.5063 | 514 | 0.9367 | 0.6970 | 0.9367 | 0.9679 |
| 0.6475 | 6.5316 | 516 | 0.8781 | 0.6970 | 0.8781 | 0.9371 |
| 0.6475 | 6.5570 | 518 | 0.8437 | 0.6875 | 0.8437 | 0.9185 |
| 0.6475 | 6.5823 | 520 | 0.8290 | 0.6552 | 0.8290 | 0.9105 |
| 0.6475 | 6.6076 | 522 | 0.8216 | 0.6552 | 0.8216 | 0.9064 |
| 0.6475 | 6.6329 | 524 | 0.8269 | 0.6552 | 0.8269 | 0.9093 |
| 0.6475 | 6.6582 | 526 | 0.8268 | 0.6552 | 0.8268 | 0.9093 |
| 0.6475 | 6.6835 | 528 | 0.8331 | 0.6552 | 0.8331 | 0.9127 |
| 0.6475 | 6.7089 | 530 | 0.8534 | 0.6552 | 0.8534 | 0.9238 |
| 0.6475 | 6.7342 | 532 | 0.9045 | 0.7109 | 0.9045 | 0.9510 |
| 0.6475 | 6.7595 | 534 | 0.9748 | 0.6698 | 0.9748 | 0.9873 |
| 0.6475 | 6.7848 | 536 | 1.0814 | 0.6807 | 1.0814 | 1.0399 |
| 0.6475 | 6.8101 | 538 | 1.1870 | 0.6616 | 1.1870 | 1.0895 |
| 0.6475 | 6.8354 | 540 | 1.2155 | 0.6265 | 1.2155 | 1.1025 |
| 0.6475 | 6.8608 | 542 | 1.1775 | 0.6265 | 1.1775 | 1.0851 |
| 0.6475 | 6.8861 | 544 | 1.0887 | 0.6265 | 1.0887 | 1.0434 |
| 0.6475 | 6.9114 | 546 | 0.9836 | 0.6376 | 0.9836 | 0.9918 |
| 0.6475 | 6.9367 | 548 | 0.8985 | 0.6303 | 0.8985 | 0.9479 |
| 0.6475 | 6.9620 | 550 | 0.8769 | 0.6303 | 0.8769 | 0.9365 |
| 0.6475 | 6.9873 | 552 | 0.8699 | 0.6596 | 0.8699 | 0.9327 |
| 0.6475 | 7.0127 | 554 | 0.8594 | 0.6875 | 0.8594 | 0.9271 |
| 0.6475 | 7.0380 | 556 | 0.8826 | 0.6875 | 0.8826 | 0.9395 |
| 0.6475 | 7.0633 | 558 | 0.9297 | 0.6703 | 0.9297 | 0.9642 |
| 0.6475 | 7.0886 | 560 | 0.9628 | 0.6927 | 0.9628 | 0.9812 |
| 0.6475 | 7.1139 | 562 | 0.9754 | 0.6759 | 0.9754 | 0.9876 |
| 0.6475 | 7.1392 | 564 | 0.9448 | 0.6970 | 0.9448 | 0.9720 |
| 0.6475 | 7.1646 | 566 | 0.9051 | 0.6495 | 0.9051 | 0.9514 |
| 0.6475 | 7.1899 | 568 | 0.8463 | 0.6818 | 0.8463 | 0.9199 |
| 0.6475 | 7.2152 | 570 | 0.8162 | 0.6818 | 0.8162 | 0.9034 |
| 0.6475 | 7.2405 | 572 | 0.8082 | 0.6818 | 0.8082 | 0.8990 |
| 0.6475 | 7.2658 | 574 | 0.8249 | 0.6818 | 0.8249 | 0.9082 |
| 0.6475 | 7.2911 | 576 | 0.8418 | 0.6818 | 0.8418 | 0.9175 |
| 0.6475 | 7.3165 | 578 | 0.8674 | 0.6451 | 0.8674 | 0.9314 |
| 0.6475 | 7.3418 | 580 | 0.9299 | 0.73 | 0.9299 | 0.9643 |
| 0.6475 | 7.3671 | 582 | 0.9911 | 0.7123 | 0.9911 | 0.9955 |
| 0.6475 | 7.3924 | 584 | 1.0459 | 0.5954 | 1.0459 | 1.0227 |
| 0.6475 | 7.4177 | 586 | 1.0527 | 0.6023 | 1.0527 | 1.0260 |
| 0.6475 | 7.4430 | 588 | 1.0343 | 0.6824 | 1.0343 | 1.0170 |
| 0.6475 | 7.4684 | 590 | 1.0249 | 0.5697 | 1.0249 | 1.0124 |
| 0.6475 | 7.4937 | 592 | 1.0292 | 0.6824 | 1.0292 | 1.0145 |
| 0.6475 | 7.5190 | 594 | 1.0448 | 0.6225 | 1.0448 | 1.0221 |
| 0.6475 | 7.5443 | 596 | 1.0454 | 0.6807 | 1.0454 | 1.0224 |
| 0.6475 | 7.5696 | 598 | 1.0315 | 0.7259 | 1.0315 | 1.0156 |
| 0.6475 | 7.5949 | 600 | 1.0074 | 0.7522 | 1.0074 | 1.0037 |
| 0.6475 | 7.6203 | 602 | 0.9401 | 0.7448 | 0.9401 | 0.9696 |
| 0.6475 | 7.6456 | 604 | 0.8874 | 0.6944 | 0.8874 | 0.9420 |
| 0.6475 | 7.6709 | 606 | 0.8569 | 0.6944 | 0.8569 | 0.9257 |
| 0.6475 | 7.6962 | 608 | 0.8374 | 0.6603 | 0.8374 | 0.9151 |
| 0.6475 | 7.7215 | 610 | 0.8485 | 0.6603 | 0.8485 | 0.9211 |
| 0.6475 | 7.7468 | 612 | 0.8913 | 0.6603 | 0.8913 | 0.9441 |
| 0.6475 | 7.7722 | 614 | 0.9285 | 0.7064 | 0.9285 | 0.9636 |
| 0.6475 | 7.7975 | 616 | 0.9821 | 0.7522 | 0.9821 | 0.9910 |
| 0.6475 | 7.8228 | 618 | 1.0064 | 0.7369 | 1.0064 | 1.0032 |
| 0.6475 | 7.8481 | 620 | 0.9939 | 0.7033 | 0.9939 | 0.9969 |
| 0.6475 | 7.8734 | 622 | 0.9851 | 0.6764 | 0.9851 | 0.9925 |
| 0.6475 | 7.8987 | 624 | 0.9784 | 0.6256 | 0.9784 | 0.9891 |
| 0.6475 | 7.9241 | 626 | 0.9749 | 0.6256 | 0.9749 | 0.9874 |
| 0.6475 | 7.9494 | 628 | 0.9832 | 0.6256 | 0.9832 | 0.9916 |
| 0.6475 | 7.9747 | 630 | 0.9848 | 0.6764 | 0.9848 | 0.9924 |
| 0.6475 | 8.0 | 632 | 0.9866 | 0.7033 | 0.9866 | 0.9933 |
| 0.6475 | 8.0253 | 634 | 0.9796 | 0.7033 | 0.9796 | 0.9897 |
| 0.6475 | 8.0506 | 636 | 0.9937 | 0.7033 | 0.9937 | 0.9969 |
| 0.6475 | 8.0759 | 638 | 1.0059 | 0.7369 | 1.0059 | 1.0030 |
| 0.6475 | 8.1013 | 640 | 1.0098 | 0.7369 | 1.0098 | 1.0049 |
| 0.6475 | 8.1266 | 642 | 1.0186 | 0.7369 | 1.0186 | 1.0092 |
| 0.6475 | 8.1519 | 644 | 1.0179 | 0.7033 | 1.0179 | 1.0089 |
| 0.6475 | 8.1772 | 646 | 1.0015 | 0.5938 | 1.0015 | 1.0008 |
| 0.6475 | 8.2025 | 648 | 0.9891 | 0.5938 | 0.9891 | 0.9946 |
| 0.6475 | 8.2278 | 650 | 0.9891 | 0.5938 | 0.9891 | 0.9945 |
| 0.6475 | 8.2532 | 652 | 0.9828 | 0.625 | 0.9828 | 0.9913 |
| 0.6475 | 8.2785 | 654 | 0.9739 | 0.6541 | 0.9739 | 0.9869 |
| 0.6475 | 8.3038 | 656 | 0.9778 | 0.6703 | 0.9778 | 0.9888 |
| 0.6475 | 8.3291 | 658 | 0.9633 | 0.6703 | 0.9633 | 0.9815 |
| 0.6475 | 8.3544 | 660 | 0.9477 | 0.6703 | 0.9477 | 0.9735 |
| 0.6475 | 8.3797 | 662 | 0.9298 | 0.6651 | 0.9298 | 0.9643 |
| 0.6475 | 8.4051 | 664 | 0.9109 | 0.6651 | 0.9109 | 0.9544 |
| 0.6475 | 8.4304 | 666 | 0.9053 | 0.6651 | 0.9053 | 0.9515 |
| 0.6475 | 8.4557 | 668 | 0.9045 | 0.6495 | 0.9045 | 0.9511 |
| 0.6475 | 8.4810 | 670 | 0.8986 | 0.6218 | 0.8986 | 0.9479 |
| 0.6475 | 8.5063 | 672 | 0.9052 | 0.6495 | 0.9052 | 0.9514 |
| 0.6475 | 8.5316 | 674 | 0.9173 | 0.6495 | 0.9173 | 0.9578 |
| 0.6475 | 8.5570 | 676 | 0.9297 | 0.6495 | 0.9297 | 0.9642 |
| 0.6475 | 8.5823 | 678 | 0.9425 | 0.6852 | 0.9425 | 0.9708 |
| 0.6475 | 8.6076 | 680 | 0.9569 | 0.73 | 0.9569 | 0.9782 |
| 0.6475 | 8.6329 | 682 | 0.9643 | 0.73 | 0.9643 | 0.9820 |
| 0.6475 | 8.6582 | 684 | 0.9637 | 0.73 | 0.9637 | 0.9817 |
| 0.6475 | 8.6835 | 686 | 0.9536 | 0.73 | 0.9536 | 0.9765 |
| 0.6475 | 8.7089 | 688 | 0.9387 | 0.6852 | 0.9387 | 0.9689 |
| 0.6475 | 8.7342 | 690 | 0.9238 | 0.6852 | 0.9238 | 0.9612 |
| 0.6475 | 8.7595 | 692 | 0.9180 | 0.6603 | 0.9180 | 0.9581 |
| 0.6475 | 8.7848 | 694 | 0.9193 | 0.6603 | 0.9193 | 0.9588 |
| 0.6475 | 8.8101 | 696 | 0.9252 | 0.6218 | 0.9252 | 0.9619 |
| 0.6475 | 8.8354 | 698 | 0.9294 | 0.6218 | 0.9294 | 0.9640 |
| 0.6475 | 8.8608 | 700 | 0.9273 | 0.6218 | 0.9273 | 0.9630 |
| 0.6475 | 8.8861 | 702 | 0.9239 | 0.6218 | 0.9239 | 0.9612 |
| 0.6475 | 8.9114 | 704 | 0.9340 | 0.6218 | 0.9340 | 0.9664 |
| 0.6475 | 8.9367 | 706 | 0.9496 | 0.6495 | 0.9496 | 0.9745 |
| 0.6475 | 8.9620 | 708 | 0.9719 | 0.6970 | 0.9719 | 0.9859 |
| 0.6475 | 8.9873 | 710 | 0.9858 | 0.6970 | 0.9858 | 0.9929 |
| 0.6475 | 9.0127 | 712 | 0.9863 | 0.6970 | 0.9863 | 0.9931 |
| 0.6475 | 9.0380 | 714 | 0.9862 | 0.6970 | 0.9862 | 0.9931 |
| 0.6475 | 9.0633 | 716 | 0.9934 | 0.73 | 0.9934 | 0.9967 |
| 0.6475 | 9.0886 | 718 | 0.9972 | 0.73 | 0.9972 | 0.9986 |
| 0.6475 | 9.1139 | 720 | 1.0048 | 0.73 | 1.0048 | 1.0024 |
| 0.6475 | 9.1392 | 722 | 1.0198 | 0.7369 | 1.0198 | 1.0099 |
| 0.6475 | 9.1646 | 724 | 1.0231 | 0.7369 | 1.0231 | 1.0115 |
| 0.6475 | 9.1899 | 726 | 1.0172 | 0.73 | 1.0172 | 1.0086 |
| 0.6475 | 9.2152 | 728 | 1.0118 | 0.7369 | 1.0118 | 1.0059 |
| 0.6475 | 9.2405 | 730 | 1.0126 | 0.7369 | 1.0126 | 1.0063 |
| 0.6475 | 9.2658 | 732 | 1.0175 | 0.7369 | 1.0175 | 1.0087 |
| 0.6475 | 9.2911 | 734 | 1.0160 | 0.7033 | 1.0160 | 1.0080 |
| 0.6475 | 9.3165 | 736 | 1.0142 | 0.7033 | 1.0142 | 1.0071 |
| 0.6475 | 9.3418 | 738 | 1.0153 | 0.7033 | 1.0153 | 1.0076 |
| 0.6475 | 9.3671 | 740 | 1.0144 | 0.6759 | 1.0144 | 1.0072 |
| 0.6475 | 9.3924 | 742 | 1.0187 | 0.6585 | 1.0187 | 1.0093 |
| 0.6475 | 9.4177 | 744 | 1.0190 | 0.6585 | 1.0190 | 1.0095 |
| 0.6475 | 9.4430 | 746 | 1.0164 | 0.6585 | 1.0164 | 1.0081 |
| 0.6475 | 9.4684 | 748 | 1.0141 | 0.6585 | 1.0141 | 1.0070 |
| 0.6475 | 9.4937 | 750 | 1.0125 | 0.6866 | 1.0125 | 1.0062 |
| 0.6475 | 9.5190 | 752 | 1.0109 | 0.6866 | 1.0109 | 1.0054 |
| 0.6475 | 9.5443 | 754 | 1.0086 | 0.7033 | 1.0086 | 1.0043 |
| 0.6475 | 9.5696 | 756 | 1.0105 | 0.7033 | 1.0105 | 1.0052 |
| 0.6475 | 9.5949 | 758 | 1.0108 | 0.7033 | 1.0108 | 1.0054 |
| 0.6475 | 9.6203 | 760 | 1.0078 | 0.7033 | 1.0078 | 1.0039 |
| 0.6475 | 9.6456 | 762 | 1.0012 | 0.7033 | 1.0012 | 1.0006 |
| 0.6475 | 9.6709 | 764 | 0.9906 | 0.7033 | 0.9906 | 0.9953 |
| 0.6475 | 9.6962 | 766 | 0.9791 | 0.7033 | 0.9791 | 0.9895 |
| 0.6475 | 9.7215 | 768 | 0.9705 | 0.6541 | 0.9705 | 0.9852 |
| 0.6475 | 9.7468 | 770 | 0.9676 | 0.6541 | 0.9676 | 0.9837 |
| 0.6475 | 9.7722 | 772 | 0.9641 | 0.6495 | 0.9641 | 0.9819 |
| 0.6475 | 9.7975 | 774 | 0.9594 | 0.6495 | 0.9594 | 0.9795 |
| 0.6475 | 9.8228 | 776 | 0.9571 | 0.6495 | 0.9571 | 0.9783 |
| 0.6475 | 9.8481 | 778 | 0.9558 | 0.6495 | 0.9558 | 0.9776 |
| 0.6475 | 9.8734 | 780 | 0.9562 | 0.6495 | 0.9562 | 0.9779 |
| 0.6475 | 9.8987 | 782 | 0.9575 | 0.6495 | 0.9575 | 0.9785 |
| 0.6475 | 9.9241 | 784 | 0.9585 | 0.6495 | 0.9585 | 0.9790 |
| 0.6475 | 9.9494 | 786 | 0.9591 | 0.6495 | 0.9591 | 0.9793 |
| 0.6475 | 9.9747 | 788 | 0.9590 | 0.6495 | 0.9590 | 0.9793 |
| 0.6475 | 10.0 | 790 | 0.9590 | 0.6495 | 0.9590 | 0.9793 |

### Framework versions

- Transformers 4.44.2
- Pytorch 2.4.0+cu118
- Datasets 2.21.0
- Tokenizers 0.19.1
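
The card reports Qwk, Mse, and Rmse without saying how they were computed. In the training results, Validation Loss and Mse are identical in every row, which suggests (but does not confirm) a mean-squared-error regression objective, and each Rmse value is the square root of the corresponding Mse. The sketch below shows one plausible way to compute such metrics in plain Python. The 1-5 score range, the sample data, the rounding step before computing quadratic weighted kappa, and the function names are all assumptions made for illustration; none of them come from this card.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error between gold scores and raw model outputs."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def quadratic_weighted_kappa(y_true, y_pred, min_label, max_label):
    """Cohen's kappa with quadratic weights over integer labels.

    Assumes at least two possible labels and that the ratings are not all
    concentrated in a single category (otherwise the denominator is zero).
    """
    n_labels = max_label - min_label + 1
    n = len(y_true)

    # Observed confusion matrix: rows are gold labels, columns are predictions.
    observed = [[0] * n_labels for _ in range(n_labels)]
    for t, p in zip(y_true, y_pred):
        observed[t - min_label][p - min_label] += 1

    true_hist = [sum(row) for row in observed]
    pred_hist = [sum(observed[i][j] for i in range(n_labels)) for j in range(n_labels)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_labels):
        for j in range(n_labels):
            weight = (i - j) ** 2 / (n_labels - 1) ** 2
            # Expected count under independence of the two marginal histograms.
            expected = true_hist[i] * pred_hist[j] / n
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected
    return 1.0 - weighted_observed / weighted_expected


# Hypothetical data: gold scores on an assumed 1-5 scale and raw regression outputs.
gold = [1, 2, 3, 4, 5]
raw_outputs = [1.2, 2.1, 2.6, 4.4, 3.9]

# Assumption: outputs are rounded and clipped to the label range before computing QWK.
rounded = [min(5, max(1, round(x))) for x in raw_outputs]

example_mse = mse(gold, raw_outputs)
example_rmse = math.sqrt(example_mse)
example_qwk = quadratic_weighted_kappa(gold, rounded, 1, 5)
print(f"mse={example_mse:.4f} rmse={example_rmse:.4f} qwk={example_qwk:.4f}")
```

The same relationship can be checked against the reported evaluation numbers: the square root of the final Mse, 0.9590, rounds to the reported Rmse of 0.9793.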