metadata
library_name: transformers
tags:
- generated_from_trainer
metrics:
- accuracy
model-index:
- name: outputs
results: []
outputs
This model was trained from scratch on an unknown dataset. At the final evaluation (step 2650, epoch 49.77), it achieves the following results on the evaluation set:
- Accuracy: 0.8503
- Loss: 0.6328
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 128
- eval_batch_size: 128
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 512
- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 1000
- num_epochs: 50
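The hyperparameters above can be read as a small piece of arithmetic. The sketch below reproduces the effective batch size and the shape of a linear schedule with warmup in plain Python. It is an illustration, not the training code: `TOTAL_STEPS` is taken from the last row of the training results table, `NUM_DEVICES = 1` is an assumption implied by 128 × 4 = 512, and the function names are hypothetical.

```python
# Values copied from the hyperparameter list.
LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 128
GRAD_ACCUM_STEPS = 4
WARMUP_STEPS = 1000

# Assumption: total optimizer steps, read from the final row of the results table.
TOTAL_STEPS = 2650
# Assumption: a single device, which is what 128 * 4 = 512 implies.
NUM_DEVICES = 1


def effective_batch_size() -> int:
    """Number of samples consumed per optimizer step."""
    return TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        # Ramp from 0 up to the peak learning rate.
        return LEARNING_RATE * step / WARMUP_STEPS
    # Decay from the peak down to 0 at the last step.
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / (TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size())
    for step in (0, 500, 1000, 1825, 2650):
        print(f"step {step:>4}: lr = {linear_lr(step):.2e}")
```

Under these assumptions the warmup covers 1000 of 2650 optimizer steps, roughly 38% of training, so the peak learning rate of 5e-05 is only reached around epoch 19.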
Training results
Training Loss | Epoch | Step | Accuracy | Validation Loss |
---|---|---|---|---|
2.9726 | 0.9953 | 53 | 0.1321 | 2.9325 |
2.5775 | 1.9906 | 106 | 0.2765 | 2.4443 |
2.0438 | 2.9859 | 159 | 0.4247 | 1.9166 |
1.5681 | 4.0 | 213 | 0.5003 | 1.5682 |
1.3107 | 4.9953 | 266 | 0.5581 | 1.3651 |
1.1594 | 5.9906 | 319 | 0.6131 | 1.1995 |
1.0232 | 6.9859 | 372 | 0.6575 | 1.0813 |
0.934 | 8.0 | 426 | 0.7081 | 0.9652 |
0.8727 | 8.9953 | 479 | 0.7333 | 0.8802 |
0.7644 | 9.9906 | 532 | 0.7378 | 0.8551 |
0.7007 | 10.9859 | 585 | 0.7663 | 0.7584 |
0.6585 | 12.0 | 639 | 0.7673 | 0.7550 |
0.59 | 12.9953 | 692 | 0.7847 | 0.7072 |
0.5775 | 13.9906 | 745 | 0.7860 | 0.7042 |
0.5487 | 14.9859 | 798 | 0.7981 | 0.6649 |
0.5296 | 16.0 | 852 | 0.7958 | 0.6387 |
0.4866 | 16.9953 | 905 | 0.8125 | 0.6029 |
0.4779 | 17.9906 | 958 | 0.7935 | 0.6498 |
0.4418 | 18.9859 | 1011 | 0.8128 | 0.6004 |
0.4334 | 20.0 | 1065 | 0.8165 | 0.5995 |
0.4097 | 20.9953 | 1118 | 0.8326 | 0.5508 |
0.3947 | 21.9906 | 1171 | 0.8315 | 0.5585 |
0.3521 | 22.9859 | 1224 | 0.8328 | 0.5513 |
0.3298 | 24.0 | 1278 | 0.8319 | 0.5810 |
0.3216 | 24.9953 | 1331 | 0.8358 | 0.5499 |
0.3086 | 25.9906 | 1384 | 0.8394 | 0.5383 |
0.2912 | 26.9859 | 1437 | 0.8349 | 0.5845 |
0.2801 | 28.0 | 1491 | 0.8423 | 0.5717 |
0.2677 | 28.9953 | 1544 | 0.8434 | 0.5563 |
0.263 | 29.9906 | 1597 | 0.8434 | 0.5684 |
0.244 | 30.9859 | 1650 | 0.8408 | 0.5900 |
0.2449 | 32.0 | 1704 | 0.8330 | 0.6121 |
0.2276 | 32.9953 | 1757 | 0.8428 | 0.5891 |
0.2407 | 33.9906 | 1810 | 0.8374 | 0.6033 |
0.1997 | 34.9859 | 1863 | 0.8459 | 0.5969 |
0.2081 | 36.0 | 1917 | 0.8451 | 0.5824 |
0.1936 | 36.9953 | 1970 | 0.8470 | 0.5834 |
0.1975 | 37.9906 | 2023 | 0.8446 | 0.6106 |
0.1938 | 38.9859 | 2076 | 0.8433 | 0.6166 |
0.1874 | 40.0 | 2130 | 0.8538 | 0.5823 |
0.184 | 40.9953 | 2183 | 0.8434 | 0.6395 |
0.1584 | 41.9906 | 2236 | 0.8542 | 0.6060 |
0.1608 | 42.9859 | 2289 | 0.8479 | 0.6289 |
0.1604 | 44.0 | 2343 | 0.8523 | 0.6105 |
0.1398 | 44.9953 | 2396 | 0.8502 | 0.6340 |
0.1487 | 45.9906 | 2449 | 0.8489 | 0.6414 |
0.137 | 46.9859 | 2502 | 0.8484 | 0.6285 |
0.1223 | 48.0 | 2556 | 0.8507 | 0.6331 |
0.1339 | 48.9953 | 2609 | 0.8492 | 0.6295 |
0.1368 | 49.7653 | 2650 | 0.8503 | 0.6328 |
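One way to read the table is to compare checkpoints by validation loss and by accuracy. The sketch below does this for a handful of rows copied from the table; the `EvalRow` type and helper names are hypothetical, and the row subset is chosen for illustration only. The gap between validation and training loss is used here as a rough overfitting indicator, not a rigorous measure.

```python
from typing import List, NamedTuple


class EvalRow(NamedTuple):
    step: int
    epoch: float
    train_loss: float
    accuracy: float
    val_loss: float


# A subset of rows copied from the training results table (not the full log).
ROWS: List[EvalRow] = [
    EvalRow(53, 0.9953, 2.9726, 0.1321, 2.9325),
    EvalRow(1065, 20.0, 0.4334, 0.8165, 0.5995),
    EvalRow(1384, 25.9906, 0.3086, 0.8394, 0.5383),
    EvalRow(2130, 40.0, 0.1874, 0.8538, 0.5823),
    EvalRow(2236, 41.9906, 0.1584, 0.8542, 0.6060),
    EvalRow(2650, 49.7653, 0.1368, 0.8503, 0.6328),
]


def best_by_val_loss(rows: List[EvalRow]) -> EvalRow:
    """Row with the lowest validation loss."""
    return min(rows, key=lambda r: r.val_loss)


def best_by_accuracy(rows: List[EvalRow]) -> EvalRow:
    """Row with the highest evaluation accuracy."""
    return max(rows, key=lambda r: r.accuracy)


def generalization_gap(row: EvalRow) -> float:
    """Validation loss minus training loss; a larger gap suggests more overfitting."""
    return row.val_loss - row.train_loss


if __name__ == "__main__":
    print("lowest validation loss:", best_by_val_loss(ROWS))
    print("highest accuracy:", best_by_accuracy(ROWS))
    for row in ROWS:
        print(f"step {row.step:>4}: gap = {generalization_gap(row):.4f}")
```

Across the full table, validation loss is lowest at step 1384 (0.5383) and accuracy is highest at step 2236 (0.8542), while the final checkpoint at step 2650 reports 0.6328 and 0.8503. Training loss keeps falling after step 1384 as validation loss drifts upward, so an earlier checkpoint may generalize slightly better than the last one, depending on which metric matters for the use case.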
Framework versions
- Transformers 4.46.0.dev0
- Pytorch 2.4.0+cu121
- Datasets 2.20.0
- Tokenizers 0.20.1