# t5-small-codesearchnet-multilang-python-java
This model is a fine-tuned version of [t5-small](https://huggingface.co/t5-small) on an unspecified dataset (judging by the model name, likely the Python and Java subsets of CodeSearchNet). It achieves the following results on the evaluation set (a sketch of how such metrics are typically computed follows the list):
- Loss: 0.3690
- Bleu: 0.0229
- Rouge1: 0.4217
- Rouge2: 0.3443
- Avg Length: 17.1574
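
The card does not document how these metrics were computed. A minimal sketch using the `evaluate` library, assuming the standard `bleu` and `rouge` metrics and using illustrative (not actual) predictions and references:

```python
import evaluate

# Illustrative examples only; the actual evaluation set is not documented.
predictions = ["Return the sum of two numbers."]
references = ["Returns the sum of a and b."]

bleu = evaluate.load("bleu")
rouge = evaluate.load("rouge")

print(bleu.compute(predictions=predictions, references=[[r] for r in references])["bleu"])
scores = rouge.compute(predictions=predictions, references=references)
print(scores["rouge1"], scores["rouge2"])

# "Avg Length" is presumably the mean length of the generated summaries
# (e.g. in tokens); the exact unit is not stated in the card.
```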
## Model description
More information needed
## Intended uses & limitations
More information needed
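
Although usage details are undocumented, the model name suggests it summarizes Python and Java code. A minimal inference sketch under that assumption (the repo id below is a placeholder for the actual Hub path):

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Hypothetical repo id; replace with the model's actual Hub path.
model_id = "t5-small-codesearchnet-multilang-python-java"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Assumption: the model maps a function body to a short natural-language
# summary (docstring-style); the expected input format is not documented.
code = "def add(a, b):\n    return a + b"
inputs = tokenizer(code, return_tensors="pt", truncation=True, max_length=512)
output_ids = model.generate(**inputs, max_length=64, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```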
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a `Seq2SeqTrainingArguments` sketch reproducing them follows the list):
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 10
- total_train_batch_size: 80
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 15
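
A sketch translating the hyperparameters above into `Seq2SeqTrainingArguments`; `output_dir`, the evaluation strategy, and `predict_with_generate` are assumptions, not taken from the original card:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="t5-small-codesearchnet-multilang-python-java",  # assumption
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=10,  # effective train batch size: 80
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=15,
    evaluation_strategy="epoch",   # assumption: the table reports per-epoch validation
    predict_with_generate=True,    # assumption: needed for BLEU/ROUGE during eval
)
```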
### Training results
| Training Loss | Epoch | Step | Validation Loss | Bleu   | Rouge1 | Rouge2 | Avg Length |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:----------:|
| No log        | 1.0   | 375  | 0.4444          | 0.0215 | 0.369  | 0.3113 | 18.0234    |
| 1.954         | 2.0   | 750  | 0.3842          | 0.0214 | 0.398  | 0.3329 | 16.9644    |
| 0.3621        | 3.0   | 1125 | 0.3631          | 0.0238 | 0.4089 | 0.3377 | 17.3726    |
| 0.3166        | 4.0   | 1500 | 0.3472          | 0.0214 | 0.4152 | 0.3413 | 16.956     |
| 0.3166        | 5.0   | 1875 | 0.3403          | 0.0238 | 0.4191 | 0.3437 | 17.198     |
| 0.2846        | 6.0   | 2250 | 0.3333          | 0.0247 | 0.414  | 0.3456 | 17.5016    |
| 0.2605        | 7.0   | 2625 | 0.3357          | 0.022  | 0.4215 | 0.346  | 17.023     |
| 0.2422        | 8.0   | 3000 | 0.3317          | 0.0287 | 0.4277 | 0.3522 | 17.658     |
| 0.2422        | 9.0   | 3375 | 0.3334          | 0.0252 | 0.4288 | 0.351  | 17.3284    |
| 0.2221        | 10.0  | 3750 | 0.3347          | 0.0225 | 0.4215 | 0.3435 | 17.1506    |
| 0.2073        | 11.0  | 4125 | 0.3392          | 0.0258 | 0.4234 | 0.3498 | 17.4554    |
| 0.194         | 12.0  | 4500 | 0.3445          | 0.0245 | 0.427  | 0.3488 | 17.2398    |
| 0.194         | 13.0  | 4875 | 0.3614          | 0.0253 | 0.4284 | 0.3489 | 17.3902    |
| 0.1789        | 14.0  | 5250 | 0.3556          | 0.0276 | 0.4261 | 0.3504 | 17.5164    |
| 0.1681        | 15.0  | 5625 | 0.3690          | 0.0229 | 0.4217 | 0.3443 | 17.1574    |
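
Note that validation loss bottoms out at epoch 8 (0.3317, which also gives the best Bleu at 0.0287) and drifts upward afterwards, so the epoch-15 checkpoint behind the headline numbers above is not necessarily the best-performing one.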
### Framework versions
- Transformers 4.28.1
- Pytorch 2.0.0+cu118
- Datasets 2.12.0
- Tokenizers 0.13.3