# Vit-GPT2-COCO2017Flickr-115k-12

This model is a fine-tuned version of [NourFakih/Vit-GPT2-COCO2017Flickr-115k-12](https://huggingface.co/NourFakih/Vit-GPT2-COCO2017Flickr-115k-12) on an unknown dataset. It achieves the following results on the evaluation set:
- Gen Len: 11.9339
- Loss: 0.5262
- Rouge1: 40.547
- Rouge2: 14.9559
- Rougel: 36.6474
- Rougelsum: 36.655
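Since this is a ViT-encoder/GPT-2-decoder captioning model, it can be loaded through the standard `VisionEncoderDecoderModel` API. Below is a minimal inference sketch; the image path and the `max_length`/`num_beams` generation settings are illustrative assumptions, not values taken from this card:

```python
import torch
from PIL import Image
from transformers import AutoTokenizer, ViTImageProcessor, VisionEncoderDecoderModel

model_id = "NourFakih/Vit-GPT2-COCO2017Flickr-115k-12"
model = VisionEncoderDecoderModel.from_pretrained(model_id)
processor = ViTImageProcessor.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model.eval()

# "example.jpg" is a placeholder path; any RGB image works.
image = Image.open("example.jpg").convert("RGB")
pixel_values = processor(images=image, return_tensors="pt").pixel_values

with torch.no_grad():
    output_ids = model.generate(pixel_values, max_length=16, num_beams=4)
caption = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(caption)
```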
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3.0
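For reference, a minimal `Seq2SeqTrainingArguments` sketch that mirrors the hyperparameters above; `output_dir` is a hypothetical name, and the 500-step evaluation cadence is inferred from the results table below rather than listed explicitly:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="Vit-GPT2-COCO2017Flickr-115k-12",  # hypothetical output directory
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=4,  # total train batch size: 4 * 4 = 16
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=3.0,
    eval_strategy="steps",          # the results table reports metrics every 500 steps
    eval_steps=500,
    predict_with_generate=True,     # needed so Gen Len and ROUGE can be computed at eval time
)
# Adam with betas=(0.9, 0.999) and epsilon=1e-08 is the Trainer's default
# optimizer configuration, so no explicit optimizer argument is required.
```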
### Training results
| Training Loss | Epoch | Step | Gen Len | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|---|---|---|---|---|---|---|---|---|
| 0.3804 | 0.0696 | 500 | 11.8970 | 0.4475 | 41.2997 | 15.8517 | 37.5088 | 37.5135 |
| 0.383 | 0.1391 | 1000 | 11.4465 | 0.4486 | 41.0786 | 15.592 | 37.2557 | 37.247 |
| 0.3804 | 0.2087 | 1500 | 11.7144 | 0.4462 | 41.0521 | 15.5552 | 37.3142 | 37.3142 |
| 0.376 | 0.2783 | 2000 | 11.7237 | 0.4503 | 41.0712 | 15.4215 | 37.1593 | 37.1483 |
| 0.3742 | 0.3478 | 2500 | 12.056 | 0.4424 | 41.0197 | 15.5533 | 37.0838 | 37.0828 |
| 0.3702 | 0.4174 | 3000 | 11.51 | 0.4476 | 41.3443 | 15.849 | 37.5657 | 37.573 |
| 0.3682 | 0.4870 | 3500 | 11.9582 | 0.4470 | 41.5477 | 16.1138 | 37.5701 | 37.5784 |
| 0.367 | 0.5565 | 4000 | 11.3945 | 0.4440 | 41.244 | 15.7945 | 37.5079 | 37.5147 |
| 0.3609 | 0.6261 | 4500 | 11.8043 | 0.4479 | 41.0531 | 15.81 | 37.3273 | 37.3317 |
| 0.3638 | 0.6957 | 5000 | 11.6248 | 0.4414 | 41.3129 | 15.94 | 37.5627 | 37.583 |
| 0.3543 | 0.7652 | 5500 | 11.5961 | 0.4482 | 41.1584 | 15.7318 | 37.4062 | 37.4104 |
| 0.3521 | 0.8348 | 6000 | 11.7494 | 0.4445 | 41.3954 | 15.8539 | 37.6111 | 37.6208 |
| 0.3537 | 0.9043 | 6500 | 11.8193 | 0.4438 | 41.6655 | 16.1228 | 37.7752 | 37.7812 |
| 0.3459 | 0.9739 | 7000 | 11.614 | 0.4427 | 41.4156 | 15.8963 | 37.5839 | 37.5837 |
| 0.3092 | 1.0435 | 7500 | 11.9294 | 0.4620 | 41.2316 | 15.7114 | 37.3554 | 37.3522 |
| 0.2924 | 1.1130 | 8000 | 11.8922 | 0.4626 | 40.8321 | 15.5115 | 37.0416 | 37.0319 |
| 0.2886 | 1.1826 | 8500 | 11.8237 | 0.4692 | 40.9025 | 15.347 | 37.0047 | 36.9942 |
| 0.2901 | 1.2522 | 9000 | 11.8817 | 0.4649 | 41.0739 | 15.3744 | 37.1782 | 37.1775 |
| 0.2868 | 1.3217 | 9500 | 11.8390 | 0.4668 | 41.2378 | 15.6205 | 37.3887 | 37.3857 |
| 0.2825 | 1.3913 | 10000 | 11.6785 | 0.4680 | 41.0405 | 15.5335 | 37.1249 | 37.1293 |
| 0.2843 | 1.4609 | 10500 | 11.792 | 0.4753 | 40.7387 | 15.0593 | 36.8401 | 36.8406 |
| 0.2819 | 1.5305 | 11000 | 11.7928 | 0.4718 | 41.1479 | 15.5299 | 37.2971 | 37.2975 |
| 0.2791 | 1.6001 | 11500 | 11.7897 | 0.4728 | 40.8974 | 15.2068 | 37.0748 | 37.0794 |
| 0.2756 | 1.6696 | 12000 | 11.7343 | 0.4776 | 40.9051 | 15.3769 | 37.1201 | 37.122 |
| 0.275 | 1.7392 | 12500 | 11.6749 | 0.4799 | 41.0987 | 15.4789 | 37.1856 | 37.1813 |
| 0.2703 | 1.8088 | 13000 | 11.6395 | 0.4787 | 41.211 | 15.6686 | 37.34 | 37.3342 |
| 0.2733 | 1.8783 | 13500 | 11.7356 | 0.4808 | 41.1137 | 15.5692 | 37.2625 | 37.2592 |
| 0.27 | 1.9479 | 14000 | 11.7816 | 0.4818 | 41.4823 | 15.8925 | 37.6161 | 37.6165 |
| 0.255 | 2.0175 | 14500 | 11.6703 | 0.5038 | 40.9089 | 15.2776 | 36.9295 | 36.9275 |
| 0.2292 | 2.0871 | 15000 | 11.8069 | 0.5076 | 41.0044 | 15.3659 | 37.0441 | 37.0438 |
| 0.2226 | 2.1567 | 15500 | 11.8068 | 0.5129 | 40.8428 | 15.2553 | 36.9787 | 36.9642 |
| 0.2236 | 2.2262 | 16000 | 11.8470 | 0.5153 | 40.6711 | 15.13 | 36.8865 | 36.8805 |
| 0.2215 | 2.2958 | 16500 | 11.8765 | 0.5200 | 40.7621 | 15.0155 | 36.8522 | 36.85 |
| 0.2215 | 2.3654 | 17000 | 11.7364 | 0.5186 | 40.6314 | 15.1297 | 36.8185 | 36.8123 |
| 0.2202 | 2.4349 | 17500 | 11.9915 | 0.5208 | 40.6588 | 15.0601 | 36.5692 | 36.5715 |
| 0.2126 | 2.5045 | 18000 | 11.8697 | 0.5218 | 40.6073 | 14.8542 | 36.6365 | 36.6408 |
| 0.2178 | 2.5741 | 18500 | 11.9752 | 0.5221 | 40.2978 | 14.671 | 36.4154 | 36.4278 |
| 0.2179 | 2.6436 | 19000 | 11.9277 | 0.5201 | 40.5832 | 14.9498 | 36.6925 | 36.7034 |
| 0.2165 | 2.7132 | 19500 | 11.9769 | 0.5252 | 40.5697 | 15.0438 | 36.7188 | 36.7223 |
| 0.2134 | 2.7827 | 20000 | 11.938 | 0.5269 | 40.6224 | 15.0783 | 36.6861 | 36.6882 |
| 0.2157 | 2.8523 | 20500 | 11.8804 | 0.5289 | 40.4897 | 14.9113 | 36.6069 | 36.6136 |
| 0.2142 | 2.9219 | 21000 | 11.8809 | 0.5260 | 40.4523 | 14.952 | 36.6097 | 36.616 |
| 0.2124 | 2.9914 | 21500 | 11.9339 | 0.5262 | 40.547 | 14.9559 | 36.6474 | 36.655 |
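The Rouge1/Rouge2/Rougel/Rougelsum columns are standard ROUGE F-scores reported as percentages. A sketch of how such scores are typically computed with the `evaluate` library; the captions here are placeholders, not data from this run:

```python
import evaluate

rouge = evaluate.load("rouge")
predictions = ["a dog runs across a grassy field"]       # hypothetical generated caption
references = ["a brown dog running through the grass"]   # hypothetical reference caption
scores = rouge.compute(predictions=predictions, references=references)

# evaluate returns fractions in [0, 1]; the table above reports them scaled by 100.
print({name: round(value * 100, 4) for name, value in scores.items()})
```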
### Framework versions
- Transformers 4.42.3
- Pytorch 2.1.2
- Datasets 2.20.0
- Tokenizers 0.19.1