Vit-GPT2-COCO2017Flickr-01
This model is a fine-tuned version of NourFakih/image-captioning-Vit-GPT2-Flickr8k on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 0.2789
- Rouge1: 40.4777
- Rouge2: 15.156
- Rougel: 36.8755
- Rougelsum: 36.8813
- Gen Len: 11.92
Model description
More information needed
Intended uses & limitations
More information needed
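Until this section is filled in, a minimal captioning sketch may help. It assumes the checkpoint is published under the repository ID shown in the model tree (NourFakih/Vit-GPT2-COCO2017Flickr-01), that it loads through the Transformers `image-to-text` pipeline, and that PyTorch and Pillow are installed. The helper name `caption_image` is hypothetical and not part of the model repository.

```python
# Assumption: repository ID taken from this card's model tree.
MODEL_ID = "NourFakih/Vit-GPT2-COCO2017Flickr-01"


def caption_image(image_path, model_id=MODEL_ID):
    """Return one generated caption for an image (hypothetical helper)."""
    # Imported lazily so this module loads even without Transformers installed.
    from transformers import pipeline

    # The pipeline downloads the checkpoint on first use.
    captioner = pipeline("image-to-text", model=model_id)
    outputs = captioner(image_path)  # list of dicts with a "generated_text" key
    return outputs[0]["generated_text"].strip()
```

Calling `caption_image("example.jpg")`, where `example.jpg` is a placeholder for a local image, should return a short caption; the evaluation set averages about 12 generated tokens (Gen Len: 11.92).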
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3.0
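To make the `linear` scheduler setting concrete, the sketch below computes the learning rate at a given optimizer step. It assumes no warmup steps (none are listed above) and takes the total of 19500 steps from the last row of the training results; `linear_lr` is an illustrative name, not a library function.

```python
BASE_LR = 5e-05      # learning_rate from the hyperparameter list
TOTAL_STEPS = 19500  # final step in the training results table


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly so the rate reaches 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 6500, 13000, 19500):
    print(step, linear_lr(step))
```

Under these assumptions the rate has dropped to roughly two thirds of its starting value by the end of the first epoch and reaches zero at the final step.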
Training results
Training Loss | Epoch | Step | Gen Len | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
---|---|---|---|---|---|---|---|---|
0.2185 | 0.08 | 500 | 11.9627 | 0.2288 | 41.2368 | 15.6218 | 37.5796 | 37.5754 |
0.2097 | 0.15 | 1000 | 12.1819 | 0.2266 | 41.0126 | 15.773 | 37.2736 | 37.2843 |
0.2067 | 0.23 | 1500 | 11.1865 | 0.2260 | 41.0707 | 15.534 | 37.4934 | 37.5044 |
0.1997 | 0.31 | 2000 | 11.4404 | 0.2251 | 41.5488 | 15.8208 | 37.704 | 37.7153 |
0.1962 | 0.38 | 2500 | 12.1219 | 0.2241 | 41.6067 | 16.1235 | 37.8372 | 37.8403 |
0.1891 | 0.46 | 3000 | 12.0462 | 0.2246 | 41.7488 | 16.5323 | 38.0498 | 38.0689 |
0.1942 | 0.54 | 3500 | 11.8842 | 0.2252 | 41.3542 | 15.7955 | 37.8567 | 37.8759 |
0.186 | 0.62 | 4000 | 11.6954 | 0.2256 | 41.4582 | 15.8671 | 37.7381 | 37.7557 |
0.1822 | 0.69 | 4500 | 11.6962 | 0.2253 | 41.6779 | 15.8426 | 37.9166 | 37.9538 |
0.1829 | 0.77 | 5000 | 11.695 | 0.2248 | 41.8987 | 16.4174 | 38.3064 | 38.321 |
0.1786 | 0.85 | 5500 | 11.9762 | 0.2251 | 40.9742 | 15.6616 | 37.3227 | 37.3401 |
0.1808 | 0.92 | 6000 | 11.7042 | 0.2260 | 41.5023 | 16.0289 | 37.9925 | 37.9843 |
0.1758 | 1.0 | 6500 | 11.8888 | 0.2262 | 41.3528 | 16.0559 | 37.8786 | 37.8588 |
0.1326 | 1.08 | 7000 | 11.8173 | 0.2394 | 40.7818 | 15.486 | 37.2677 | 37.2794 |
0.1291 | 1.15 | 7500 | 11.7969 | 0.2412 | 41.4117 | 16.2382 | 37.9863 | 37.9964 |
0.1314 | 1.23 | 8000 | 11.7969 | 0.2436 | 41.1586 | 15.5594 | 37.512 | 37.5293 |
0.131 | 1.31 | 8500 | 11.8281 | 0.2427 | 41.1027 | 15.817 | 37.7167 | 37.7216 |
0.1322 | 1.38 | 9000 | 11.8927 | 0.2400 | 41.4453 | 16.0873 | 37.7242 | 37.735 |
0.1237 | 1.46 | 9500 | 11.8035 | 0.2447 | 40.704 | 15.0054 | 37.1021 | 37.1102 |
0.1289 | 1.54 | 10000 | 12.2473 | 0.2441 | 41.0159 | 15.5793 | 37.1366 | 37.1673 |
0.1236 | 1.62 | 10500 | 11.6977 | 0.2452 | 40.8137 | 15.3874 | 37.1591 | 37.1672 |
0.1241 | 1.69 | 11000 | 11.4181 | 0.2465 | 40.9985 | 15.3879 | 37.1388 | 37.1634 |
0.1219 | 1.77 | 11500 | 11.7765 | 0.2463 | 41.1345 | 15.6654 | 37.3921 | 37.4082 |
0.1234 | 1.85 | 12000 | 12.1512 | 0.2444 | 41.134 | 15.7004 | 37.3621 | 37.3993 |
0.1193 | 1.92 | 12500 | 11.6831 | 0.2466 | 40.568 | 15.1806 | 37.0715 | 37.0779 |
0.1148 | 2.0 | 13000 | 11.6546 | 0.2482 | 41.0991 | 15.4567 | 37.4898 | 37.5136 |
0.0836 | 2.08 | 13500 | 12.0708 | 0.2717 | 40.4842 | 15.0195 | 36.8428 | 36.859 |
0.0869 | 2.15 | 14000 | 12.0069 | 0.2731 | 40.6828 | 14.8559 | 36.8299 | 36.8515 |
0.0846 | 2.23 | 14500 | 12.02 | 0.2727 | 40.1785 | 14.8884 | 36.7155 | 36.7025 |
0.0829 | 2.31 | 15000 | 12.0535 | 0.2756 | 40.9047 | 15.2085 | 37.1447 | 37.1153 |
0.0855 | 2.38 | 15500 | 12.0346 | 0.2757 | 40.8628 | 14.9646 | 37.068 | 37.0583 |
0.0859 | 2.46 | 16000 | 11.8796 | 0.2762 | 40.924 | 15.2223 | 37.1443 | 37.1329 |
0.0847 | 2.54 | 16500 | 11.9292 | 0.2786 | 40.9447 | 15.2269 | 37.1398 | 37.1511 |
0.0831 | 2.62 | 17000 | 12.0958 | 0.2770 | 40.417 | 14.7542 | 36.6568 | 36.6345 |
0.0828 | 2.69 | 17500 | 11.845 | 0.2796 | 40.7295 | 15.0389 | 36.9957 | 36.9706 |
0.0782 | 2.77 | 18000 | 11.9369 | 0.2796 | 40.7406 | 15.1238 | 36.9906 | 36.9817 |
0.0798 | 2.85 | 18500 | 11.9869 | 0.2792 | 40.4692 | 15.0458 | 36.8005 | 36.7953 |
0.0794 | 2.92 | 19000 | 11.8985 | 0.2792 | 40.497 | 15.1883 | 36.8923 | 36.8945 |
0.0793 | 3.0 | 19500 | 11.92 | 0.2789 | 40.4777 | 15.156 | 36.8755 | 36.8813 |
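The table suggests overfitting after the first epoch: training loss keeps falling while validation loss rises from 0.2241 to 0.2789, and the headline metrics at the top of the card match the final row (step 19500) rather than the best one. The sketch below picks the best checkpoint from a subset of rows copied from the table and estimates the number of training examples per epoch, assuming a single device and no gradient accumulation (neither is stated in the card).

```python
# (step, validation_loss, rouge1) for a subset of rows copied from the table.
ROWS = [
    (500, 0.2288, 41.2368),
    (2500, 0.2241, 41.6067),
    (5000, 0.2248, 41.8987),
    (6500, 0.2262, 41.3528),
    (13000, 0.2482, 41.0991),
    (19500, 0.2789, 40.4777),
]

# Lowest validation loss and highest Rouge1 among the listed rows.
best_by_loss = min(ROWS, key=lambda row: row[1])
best_by_rouge1 = max(ROWS, key=lambda row: row[2])

# Assumption: one device, no gradient accumulation, so one step is one batch of 4.
steps_per_epoch = 19500 // 3
examples_per_epoch = steps_per_epoch * 4

print("best by validation loss: step", best_by_loss[0])
print("best by Rouge1: step", best_by_rouge1[0])
print("approx. examples per epoch:", examples_per_epoch)
```

Both selected rows are also the best values in the full table, so the subset does not change the outcome: step 2500 has the lowest validation loss and step 5000 the highest Rouge1.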
Framework versions
- Transformers 4.39.3
- Pytorch 2.1.2
- Datasets 2.18.0
- Tokenizers 0.15.2
Model tree for NourFakih/Vit-GPT2-COCO2017Flickr-01
Base model
nlpconnect/vit-gpt2-image-captioning