# soda-clip-finetuned
This model was trained from scratch on the soda-clip-loader dataset.
It achieves a final reported validation loss of 1.9564 on the evaluation set (step 3300, epoch 4.87; see the full training results table below).
## Model description
More information needed
## Intended uses & limitations
More information needed
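Pending a fuller description, here is a minimal inference sketch. It assumes the checkpoint loads with the standard Transformers CLIP classes (the model name suggests a CLIP fine-tune); the model path, image file, and candidate captions are placeholders:

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Placeholder path: point this at the actual checkpoint location.
model = CLIPModel.from_pretrained("soda-clip-finetuned")
processor = CLIPProcessor.from_pretrained("soda-clip-finetuned")

image = Image.open("example.jpg")  # any RGB image
texts = ["a photo of a cat", "a photo of a dog"]  # candidate captions

inputs = processor(text=texts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# logits_per_image holds image-text similarity scores; softmax ranks the captions.
probs = outputs.logits_per_image.softmax(dim=-1)
print(probs)
```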
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a `TrainingArguments` sketch mapping them onto the Trainer API follows the list):
- learning_rate: 5e-05
- train_batch_size: 128
- eval_batch_size: 128
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5.0
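A minimal reproduction sketch, assuming the Hugging Face `Trainer` was used (which the hyperparameter names suggest) and a single training device; `output_dir` and the evaluation cadence are assumptions, the latter inferred from the 100-step intervals in the results table below:

```python
from transformers import TrainingArguments

# Sketch only: maps the card's reported hyperparameters onto TrainingArguments.
training_args = TrainingArguments(
    output_dir="soda-clip-finetuned",   # assumed output path
    learning_rate=5e-5,
    per_device_train_batch_size=128,    # assumes a single device, so per-device == total
    per_device_eval_batch_size=128,
    seed=42,
    adam_beta1=0.9,                     # the card reports Adam with betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=5.0,
    evaluation_strategy="steps",        # assumed from the 100-step eval cadence below
    eval_steps=100,
)
```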
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 4.6533 | 0.15 | 100 | 4.5663 |
| 4.5243 | 0.29 | 200 | 4.4131 |
| 4.2506 | 0.44 | 300 | 3.9908 |
| 3.9692 | 0.59 | 400 | 3.8105 |
| 3.7576 | 0.74 | 500 | 3.6515 |
| 3.5935 | 0.88 | 600 | 3.4758 |
| 3.3874 | 1.03 | 700 | 3.3259 |
| 3.1691 | 1.18 | 800 | 3.1645 |
| 3.021 | 1.33 | 900 | 3.0139 |
| 2.9045 | 1.47 | 1000 | 2.9027 |
| 2.8391 | 1.62 | 1100 | 2.8245 |
| 2.7293 | 1.77 | 1200 | 2.6703 |
| 2.6177 | 1.92 | 1300 | 2.5465 |
| 2.3473 | 2.06 | 1400 | 2.5076 |
| 2.1463 | 2.21 | 1500 | 2.4233 |
| 2.0842 | 2.36 | 1600 | 2.3488 |
| 2.0204 | 2.51 | 1700 | 2.2738 |
| 2.0013 | 2.65 | 1800 | 2.2473 |
| 1.9325 | 2.8 | 1900 | 2.2017 |
| 1.9072 | 2.95 | 2000 | 2.1397 |
| 1.5792 | 3.1 | 2100 | 2.1203 |
| 1.3949 | 3.24 | 2200 | 2.0973 |
| 1.3664 | 3.39 | 2300 | 2.0737 |
| 1.3545 | 3.54 | 2400 | 2.0320 |
| 1.3144 | 3.69 | 2500 | 2.0143 |
| 1.2897 | 3.83 | 2600 | 1.9552 |
| 1.2706 | 3.98 | 2700 | 1.9497 |
| 0.9014 | 4.13 | 2800 | 1.9983 |
| 0.8365 | 4.28 | 2900 | 1.9960 |
| 0.8187 | 4.42 | 3000 | 1.9886 |
| 0.8001 | 4.57 | 3100 | 1.9709 |
| 0.7979 | 4.72 | 3200 | 1.9513 |
| 0.7698 | 4.87 | 3300 | 1.9564 |
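For a quick visual read of these results, the curves can be replotted; the sketch below copies the step and loss values verbatim from the table (matplotlib assumed available):

```python
import matplotlib.pyplot as plt

# Values transcribed from the training results table above.
steps = list(range(100, 3400, 100))
train_loss = [4.6533, 4.5243, 4.2506, 3.9692, 3.7576, 3.5935, 3.3874, 3.1691,
              3.021, 2.9045, 2.8391, 2.7293, 2.6177, 2.3473, 2.1463, 2.0842,
              2.0204, 2.0013, 1.9325, 1.9072, 1.5792, 1.3949, 1.3664, 1.3545,
              1.3144, 1.2897, 1.2706, 0.9014, 0.8365, 0.8187, 0.8001, 0.7979,
              0.7698]
val_loss = [4.5663, 4.4131, 3.9908, 3.8105, 3.6515, 3.4758, 3.3259, 3.1645,
            3.0139, 2.9027, 2.8245, 2.6703, 2.5465, 2.5076, 2.4233, 2.3488,
            2.2738, 2.2473, 2.2017, 2.1397, 2.1203, 2.0973, 2.0737, 2.0320,
            2.0143, 1.9552, 1.9497, 1.9983, 1.9960, 1.9886, 1.9709, 1.9513,
            1.9564]

plt.plot(steps, train_loss, label="training loss")
plt.plot(steps, val_loss, label="validation loss")
plt.xlabel("step")
plt.ylabel("loss")
plt.legend()
plt.show()
```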
### Framework versions
- Transformers 4.37.2
- PyTorch 2.2.0+cu121
- Datasets 2.17.0
- Tokenizers 0.15.2
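To confirm a local environment matches these versions, a quick Python check:

```python
import datasets
import tokenizers
import torch
import transformers

print(transformers.__version__)  # expect 4.37.2
print(torch.__version__)         # expect 2.2.0+cu121
print(datasets.__version__)      # expect 2.17.0
print(tokenizers.__version__)    # expect 0.15.2
```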