
t5-small_6_3-hi_en-to-en

This model is a smaller student of t5-small (6 encoder layers, 3 decoder layers) fine-tuned on the cmu_hinglish_dog dataset for Hinglish-to-English translation. It achieves the following results on the evaluation set:

  • Loss: 2.3662
  • BLEU: 18.0863
  • Gen Len: 15.2708

Model description

The student model was generated from t5-small with the make_student.py script, copying 6 encoder layers and 3 decoder layers:
python make_student.py t5-small t5_small_6_3 6 3
See examples/research_projects/seq2seq-distillation in the Transformers repository for more information.
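A minimal sketch of what this command does, assuming the create_student_by_copying_alternating_layers helper from that example (its exact signature may differ across Transformers versions):

```python
# Sketch of the student-creation step performed by make_student.py.
# Assumes the helper defined in examples/research_projects/seq2seq-distillation;
# verify the signature against your Transformers checkout.
from make_student import create_student_by_copying_alternating_layers

# Copy 6 encoder layers and 3 decoder layers from the t5-small teacher
# into a smaller student model saved under t5_small_6_3/.
student, e_layers, d_layers = create_student_by_copying_alternating_layers(
    "t5-small", save_path="t5_small_6_3", e=6, d=3
)
```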

Intended uses & limitations

The model translates Hinglish (code-mixed Hindi and English) text to English; a usage sketch follows. Limitations: more information needed.
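A minimal inference sketch (the example sentence is illustrative, and no task prefix is assumed; whether one is required depends on how the fine-tuning preprocessed its inputs):

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "sayanmandal/t5-small_6_3-hi_en-to-en"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Hinglish (code-mixed Hindi-English) input; the model generates English.
inputs = tokenizer("mujhe yeh movie bahut pasand aayi", return_tensors="pt")
outputs = model.generate(**inputs, max_length=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```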

Training and evaluation data

Trained and evaluated on the cmu_hinglish_dog dataset; see the dataset card on the Hugging Face Hub for a full description.

Translation:

  • Source (hi_en): the text in Hinglish
  • Target (en): the text in English
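A sketch of loading the dataset with the datasets library (the translation field names follow the description above, but should be verified against the dataset version you install):

```python
from datasets import load_dataset

ds = load_dataset("cmu_hinglish_dog")
example = ds["train"][0]

# Each example carries a Hinglish source and an English target.
# (Field layout assumed from the dataset card; verify on your version.)
print(example["translation"]["hi_en"])  # Hinglish source
print(example["translation"]["en"])     # English target
```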

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
  • mixed_precision_training: Native AMP
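A Seq2SeqTrainingArguments sketch reproducing these settings (output_dir is illustrative; fp16=True stands in for Native AMP):

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="t5-small_6_3-hi_en-to-en",  # illustrative path
    learning_rate=5e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    gradient_accumulation_steps=2,  # effective train batch size: 64
    lr_scheduler_type="linear",
    num_train_epochs=100,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    fp16=True,  # Native AMP mixed precision
)
```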

Training results

| Training Loss | Epoch | Step | Validation Loss | BLEU | Gen Len |
|:---|:---|:---|:---|:---|:---|
| No log | 1.0 | 126 | 3.0601 | 4.7146 | 11.9904 |
| No log | 2.0 | 252 | 2.8885 | 5.9584 | 12.3418 |
| No log | 3.0 | 378 | 2.7914 | 6.649 | 12.3758 |
| 3.4671 | 4.0 | 504 | 2.7347 | 7.3305 | 12.3854 |
| 3.4671 | 5.0 | 630 | 2.6832 | 8.3132 | 12.4268 |
| 3.4671 | 6.0 | 756 | 2.6485 | 8.339 | 12.3641 |
| 3.4671 | 7.0 | 882 | 2.6096 | 8.7269 | 12.414 |
| 3.0208 | 8.0 | 1008 | 2.5814 | 9.2163 | 12.2675 |
| 3.0208 | 9.0 | 1134 | 2.5542 | 9.448 | 12.3875 |
| 3.0208 | 10.0 | 1260 | 2.5339 | 9.9011 | 12.4321 |
| 3.0208 | 11.0 | 1386 | 2.5043 | 9.7529 | 12.5149 |
| 2.834 | 12.0 | 1512 | 2.4848 | 9.9606 | 12.4193 |
| 2.834 | 13.0 | 1638 | 2.4737 | 9.9368 | 12.3673 |
| 2.834 | 14.0 | 1764 | 2.4458 | 10.3182 | 12.4352 |
| 2.834 | 15.0 | 1890 | 2.4332 | 10.486 | 12.4671 |
| 2.7065 | 16.0 | 2016 | 2.4239 | 10.6921 | 12.414 |
| 2.7065 | 17.0 | 2142 | 2.4064 | 10.7426 | 12.4607 |
| 2.7065 | 18.0 | 2268 | 2.3941 | 11.0509 | 12.4087 |
| 2.7065 | 19.0 | 2394 | 2.3826 | 11.2407 | 12.3386 |
| 2.603 | 20.0 | 2520 | 2.3658 | 11.3711 | 12.3992 |
| 2.603 | 21.0 | 2646 | 2.3537 | 11.42 | 12.5032 |
| 2.603 | 22.0 | 2772 | 2.3475 | 12.0665 | 12.5074 |
| 2.603 | 23.0 | 2898 | 2.3398 | 12.0343 | 12.4342 |
| 2.5192 | 24.0 | 3024 | 2.3298 | 12.1011 | 12.5096 |
| 2.5192 | 25.0 | 3150 | 2.3216 | 12.2562 | 12.4809 |
| 2.5192 | 26.0 | 3276 | 2.3131 | 12.4585 | 12.4427 |
| 2.5192 | 27.0 | 3402 | 2.3052 | 12.7094 | 12.534 |
| 2.4445 | 28.0 | 3528 | 2.2984 | 12.7432 | 12.5053 |
| 2.4445 | 29.0 | 3654 | 2.2920 | 12.8409 | 12.4501 |
| 2.4445 | 30.0 | 3780 | 2.2869 | 12.6365 | 12.4936 |
| 2.4445 | 31.0 | 3906 | 2.2777 | 12.8523 | 12.5234 |
| 2.3844 | 32.0 | 4032 | 2.2788 | 12.9216 | 12.4204 |
| 2.3844 | 33.0 | 4158 | 2.2710 | 12.9568 | 12.5064 |
| 2.3844 | 34.0 | 4284 | 2.2643 | 12.9641 | 12.4299 |
| 2.3844 | 35.0 | 4410 | 2.2621 | 12.9787 | 12.448 |
| 2.3282 | 36.0 | 4536 | 2.2554 | 13.1264 | 12.4374 |
| 2.3282 | 37.0 | 4662 | 2.2481 | 13.1853 | 12.4416 |
| 2.3282 | 38.0 | 4788 | 2.2477 | 13.3259 | 12.4119 |
| 2.3282 | 39.0 | 4914 | 2.2448 | 13.2017 | 12.4278 |
| 2.2842 | 40.0 | 5040 | 2.2402 | 13.3772 | 12.4437 |
| 2.2842 | 41.0 | 5166 | 2.2373 | 13.2184 | 12.414 |
| 2.2842 | 42.0 | 5292 | 2.2357 | 13.5267 | 12.4342 |
| 2.2842 | 43.0 | 5418 | 2.2310 | 13.5754 | 12.4087 |
| 2.2388 | 44.0 | 5544 | 2.2244 | 13.653 | 12.4427 |
| 2.2388 | 45.0 | 5670 | 2.2243 | 13.6028 | 12.431 |
| 2.2388 | 46.0 | 5796 | 2.2216 | 13.7128 | 12.4151 |
| 2.2388 | 47.0 | 5922 | 2.2231 | 13.749 | 12.4172 |
| 2.2067 | 48.0 | 6048 | 2.2196 | 13.7256 | 12.4034 |
| 2.2067 | 49.0 | 6174 | 2.2125 | 13.8237 | 12.396 |
| 2.2067 | 50.0 | 6300 | 2.2131 | 13.6642 | 12.4416 |
| 2.2067 | 51.0 | 6426 | 2.2115 | 13.8876 | 12.4119 |
| 2.1688 | 52.0 | 6552 | 2.2091 | 14.0323 | 12.4639 |
| 2.1688 | 53.0 | 6678 | 2.2082 | 13.916 | 12.3843 |
| 2.1688 | 54.0 | 6804 | 2.2071 | 13.924 | 12.3758 |
| 2.1688 | 55.0 | 6930 | 2.2046 | 13.9563 | 12.4416 |
| 2.1401 | 56.0 | 7056 | 2.2020 | 14.0592 | 12.483 |
| 2.1401 | 57.0 | 7182 | 2.2047 | 13.8879 | 12.4076 |
| 2.1401 | 58.0 | 7308 | 2.2018 | 13.9267 | 12.3949 |
| 2.1401 | 59.0 | 7434 | 2.1964 | 14.0518 | 12.4363 |
| 2.1092 | 60.0 | 7560 | 2.1926 | 14.1518 | 12.4883 |
| 2.1092 | 61.0 | 7686 | 2.1972 | 14.132 | 12.4034 |
| 2.1092 | 62.0 | 7812 | 2.1939 | 14.2066 | 12.4151 |
| 2.1092 | 63.0 | 7938 | 2.1905 | 14.2923 | 12.4459 |
| 2.0932 | 64.0 | 8064 | 2.1932 | 14.2476 | 12.3418 |
| 2.0932 | 65.0 | 8190 | 2.1925 | 14.2057 | 12.3907 |
| 2.0932 | 66.0 | 8316 | 2.1906 | 14.2978 | 12.4055 |
| 2.0932 | 67.0 | 8442 | 2.1903 | 14.3276 | 12.4427 |
| 2.0706 | 68.0 | 8568 | 2.1918 | 14.4681 | 12.4034 |
| 2.0706 | 69.0 | 8694 | 2.1882 | 14.3751 | 12.4225 |
| 2.0706 | 70.0 | 8820 | 2.1870 | 14.5904 | 12.4204 |
| 2.0706 | 71.0 | 8946 | 2.1865 | 14.6409 | 12.4512 |
| 2.0517 | 72.0 | 9072 | 2.1831 | 14.6505 | 12.4352 |
| 2.0517 | 73.0 | 9198 | 2.1835 | 14.7485 | 12.4363 |
| 2.0517 | 74.0 | 9324 | 2.1824 | 14.7344 | 12.4586 |
| 2.0517 | 75.0 | 9450 | 2.1829 | 14.8097 | 12.4575 |
| 2.0388 | 76.0 | 9576 | 2.1822 | 14.6681 | 12.4108 |
| 2.0388 | 77.0 | 9702 | 2.1823 | 14.6421 | 12.4342 |
| 2.0388 | 78.0 | 9828 | 2.1816 | 14.7014 | 12.4459 |
| 2.0388 | 79.0 | 9954 | 2.1810 | 14.744 | 12.4565 |
| 2.0224 | 80.0 | 10080 | 2.1839 | 14.7889 | 12.4437 |
| 2.0224 | 81.0 | 10206 | 2.1793 | 14.802 | 12.4565 |
| 2.0224 | 82.0 | 10332 | 2.1776 | 14.7702 | 12.4214 |
| 2.0224 | 83.0 | 10458 | 2.1809 | 14.6772 | 12.4236 |
| 2.0115 | 84.0 | 10584 | 2.1786 | 14.709 | 12.4214 |
| 2.0115 | 85.0 | 10710 | 2.1805 | 14.7693 | 12.3981 |
| 2.0115 | 86.0 | 10836 | 2.1790 | 14.7628 | 12.4172 |
| 2.0115 | 87.0 | 10962 | 2.1785 | 14.7538 | 12.3992 |
| 2.0007 | 88.0 | 11088 | 2.1788 | 14.7493 | 12.3726 |
| 2.0007 | 89.0 | 11214 | 2.1788 | 14.8793 | 12.4045 |
| 2.0007 | 90.0 | 11340 | 2.1786 | 14.8318 | 12.3747 |
| 2.0007 | 91.0 | 11466 | 2.1769 | 14.8061 | 12.4013 |
| 1.9967 | 92.0 | 11592 | 2.1757 | 14.8108 | 12.3843 |
| 1.9967 | 93.0 | 11718 | 2.1747 | 14.8036 | 12.379 |
| 1.9967 | 94.0 | 11844 | 2.1764 | 14.7447 | 12.3737 |
| 1.9967 | 95.0 | 11970 | 2.1759 | 14.7759 | 12.3875 |
| 1.9924 | 96.0 | 12096 | 2.1760 | 14.7695 | 12.3875 |
| 1.9924 | 97.0 | 12222 | 2.1762 | 14.8022 | 12.3769 |
| 1.9924 | 98.0 | 12348 | 2.1763 | 14.7519 | 12.3822 |
| 1.9924 | 99.0 | 12474 | 2.1760 | 14.7756 | 12.3832 |
| 1.9903 | 100.0 | 12600 | 2.1761 | 14.7713 | 12.3822 |

Evaluation results

| Data Split | BLEU |
|:---|:---|
| Validation | 17.8061 |
| Test | 18.0863 |
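
These scores can be recomputed with sacrebleu via the evaluate library; a minimal sketch, assuming predictions and references are lists of decoded strings (the example pair is illustrative):

```python
import evaluate

sacrebleu = evaluate.load("sacrebleu")

predictions = ["I liked this movie a lot"]       # illustrative model output
references = [["I liked this movie very much"]]  # illustrative reference
result = sacrebleu.compute(predictions=predictions, references=references)
print(round(result["score"], 4))  # corpus-level BLEU
```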

Framework versions

  • Transformers 4.20.0.dev0
  • Pytorch 1.8.0
  • Datasets 2.1.0
  • Tokenizers 0.12.1