cs_m2m_0.00001_200_v0.2
This model is a fine-tuned version of facebook/m2m100_1.2B on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 8.4603
- Bleu: 0.1346
- Gen Len: 69.619
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 200
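As a rough illustration of what these settings imply, the sketch below computes the learning rate under a linear schedule. It assumes zero warmup steps and decay to zero by the final step; the card states neither. The figure of 6 optimizer steps per epoch is read off the Step column of the training results table. With a batch size of 16, that would correspond to roughly 81 to 96 training examples, if training ran on a single device without gradient accumulation (also an assumption). The function name `linear_lr` is illustrative and not part of any library.

```python
def linear_lr(step, base_lr=1e-05, total_steps=1200, warmup_steps=0):
    """Learning rate after `step` updates: linear warmup, then linear decay to zero.

    warmup_steps=0 is an assumption; the model card does not report a warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


STEPS_PER_EPOCH = 6   # from the results table: step 6 is logged at epoch 1.0
NUM_EPOCHS = 200      # num_epochs from the hyperparameter list
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS

for epoch in (0, 50, 100, 150, 200):
    step = epoch * STEPS_PER_EPOCH
    lr = linear_lr(step, total_steps=TOTAL_STEPS)
    print(f"epoch {epoch:3d}  step {step:4d}  lr {lr:.2e}")
```

Under these assumptions the run takes 1200 optimizer steps in total, and the learning rate has halved to 5e-06 by epoch 100.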
Training results
Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
---|---|---|---|---|---|
2.684 | 1.0 | 6 | 8.4517 | 0.0956 | 61.6667 |
1.978 | 2.0 | 12 | 8.4546 | 0.0985 | 61.8095 |
2.8654 | 3.0 | 18 | 8.4538 | 0.0961 | 62.4286 |
2.8165 | 4.0 | 24 | 8.4550 | 0.0991 | 63.1905 |
2.6606 | 5.0 | 30 | 8.4556 | 0.0956 | 61.0476 |
3.1159 | 6.0 | 36 | 8.4525 | 0.0964 | 60.5238 |
1.813 | 7.0 | 42 | 8.4524 | 0.0961 | 59.8095 |
2.9637 | 8.0 | 48 | 8.4520 | 0.0961 | 59.8095 |
2.1663 | 9.0 | 54 | 8.4526 | 0.0918 | 59.5714 |
2.475 | 10.0 | 60 | 8.4516 | 0.0916 | 59.381 |
2.5769 | 11.0 | 66 | 8.4493 | 0.0927 | 60.1905 |
2.414 | 12.0 | 72 | 8.4485 | 0.0927 | 60.1905 |
2.5985 | 13.0 | 78 | 8.4500 | 0.0946 | 60.1905 |
2.6263 | 14.0 | 84 | 8.4527 | 0.1003 | 61.0 |
2.2439 | 15.0 | 90 | 8.4533 | 0.0774 | 69.0952 |
1.9865 | 16.0 | 96 | 8.4542 | 0.0769 | 69.5238 |
2.2472 | 17.0 | 102 | 8.4540 | 0.0766 | 69.7619 |
2.5489 | 18.0 | 108 | 8.4534 | 0.0782 | 70.3333 |
1.9181 | 19.0 | 114 | 8.4527 | 0.0789 | 70.5714 |
2.0332 | 20.0 | 120 | 8.4505 | 0.0785 | 70.7619 |
1.9397 | 21.0 | 126 | 8.4488 | 0.0784 | 70.9048 |
2.788 | 22.0 | 132 | 8.4480 | 0.0772 | 71.9524 |
2.4842 | 23.0 | 138 | 8.4473 | 0.0778 | 71.6667 |
2.3397 | 24.0 | 144 | 8.4459 | 0.0975 | 62.6667 |
2.3303 | 25.0 | 150 | 8.4448 | 0.1314 | 71.9048 |
2.6417 | 26.0 | 156 | 8.4436 | 0.1311 | 71.9524 |
2.0759 | 27.0 | 162 | 8.4446 | 0.128 | 71.9524 |
2.0973 | 28.0 | 168 | 8.4450 | 0.1659 | 62.1905 |
2.9593 | 29.0 | 174 | 8.4455 | 0.1285 | 71.4762 |
3.0086 | 30.0 | 180 | 8.4442 | 0.1624 | 61.8571 |
2.684 | 31.0 | 186 | 8.4431 | 0.162 | 62.0952 |
2.7015 | 32.0 | 192 | 8.4442 | 0.162 | 62.0952 |
4.6745 | 33.0 | 198 | 8.4431 | 0.1624 | 62.9048 |
2.1913 | 34.0 | 204 | 8.4427 | 0.1607 | 63.0 |
2.1685 | 35.0 | 210 | 8.4443 | 0.1671 | 61.4286 |
2.3458 | 36.0 | 216 | 8.4458 | 0.1346 | 69.6667 |
2.0533 | 37.0 | 222 | 8.4456 | 0.132 | 70.1905 |
3.1101 | 38.0 | 228 | 8.4442 | 0.1335 | 69.8095 |
2.2737 | 39.0 | 234 | 8.4447 | 0.0787 | 70.7619 |
2.4838 | 40.0 | 240 | 8.4476 | 0.0784 | 70.1905 |
1.9048 | 41.0 | 246 | 8.4487 | 0.0801 | 70.4762 |
2.825 | 42.0 | 252 | 8.4495 | 0.0668 | 79.4286 |
1.7811 | 43.0 | 258 | 8.4521 | 0.0639 | 78.2381 |
2.1382 | 44.0 | 264 | 8.4545 | 0.0639 | 78.1429 |
2.2783 | 45.0 | 270 | 8.4553 | 0.0636 | 78.5714 |
2.1117 | 46.0 | 276 | 8.4558 | 0.0636 | 78.5714 |
2.0165 | 47.0 | 282 | 8.4563 | 0.0638 | 78.4762 |
2.2424 | 48.0 | 288 | 8.4568 | 0.0639 | 78.3333 |
2.7404 | 49.0 | 294 | 8.4564 | 0.0627 | 79.5714 |
3.3443 | 50.0 | 300 | 8.4560 | 0.0617 | 78.4762 |
2.7281 | 51.0 | 306 | 8.4551 | 0.0617 | 78.4762 |
2.9189 | 52.0 | 312 | 8.4520 | 0.0757 | 70.7143 |
2.3192 | 53.0 | 318 | 8.4512 | 0.0754 | 70.7619 |
2.3737 | 54.0 | 324 | 8.4505 | 0.0604 | 78.4286 |
2.4041 | 55.0 | 330 | 8.4490 | 0.0606 | 78.0952 |
4.5412 | 56.0 | 336 | 8.4478 | 0.0618 | 78.0952 |
2.399 | 57.0 | 342 | 8.4469 | 0.0617 | 78.2381 |
1.8226 | 58.0 | 348 | 8.4467 | 0.062 | 77.9048 |
2.3362 | 59.0 | 354 | 8.4463 | 0.0612 | 77.4762 |
2.4263 | 60.0 | 360 | 8.4450 | 0.0612 | 77.4762 |
2.7929 | 61.0 | 366 | 8.4439 | 0.0617 | 78.2381 |
3.2633 | 62.0 | 372 | 8.4434 | 0.0615 | 78.3333 |
2.3451 | 63.0 | 378 | 8.4436 | 0.0607 | 77.9048 |
2.8337 | 64.0 | 384 | 8.4429 | 0.061 | 77.4762 |
2.7405 | 65.0 | 390 | 8.4430 | 0.0607 | 77.9048 |
2.8955 | 66.0 | 396 | 8.4420 | 0.0614 | 78.6667 |
2.3475 | 67.0 | 402 | 8.4408 | 0.061 | 79.0952 |
2.0904 | 68.0 | 408 | 8.4383 | 0.0608 | 79.1905 |
2.4816 | 69.0 | 414 | 8.4367 | 0.0607 | 79.3333 |
2.3696 | 70.0 | 420 | 8.4365 | 0.0607 | 79.3333 |
2.7587 | 71.0 | 426 | 8.4364 | 0.0616 | 79.5714 |
2.0684 | 72.0 | 432 | 8.4369 | 0.0617 | 79.4762 |
2.5021 | 73.0 | 438 | 8.4375 | 0.0617 | 79.4762 |
1.4037 | 74.0 | 444 | 8.4362 | 0.0759 | 71.0476 |
2.1197 | 75.0 | 450 | 8.4357 | 0.0763 | 70.7619 |
2.2019 | 76.0 | 456 | 8.4378 | 0.0612 | 78.8571 |
1.8674 | 77.0 | 462 | 8.4402 | 0.062 | 77.7619 |
4.6628 | 78.0 | 468 | 8.4415 | 0.0769 | 69.3333 |
2.5704 | 79.0 | 474 | 8.4420 | 0.0769 | 69.3333 |
1.8771 | 80.0 | 480 | 8.4422 | 0.0772 | 69.1905 |
1.9444 | 81.0 | 486 | 8.4437 | 0.078 | 70.5238 |
2.0133 | 82.0 | 492 | 8.4443 | 0.0771 | 71.1429 |
2.8815 | 83.0 | 498 | 8.4445 | 0.0757 | 70.4286 |
3.0573 | 84.0 | 504 | 8.4455 | 0.0621 | 77.7143 |
2.011 | 85.0 | 510 | 8.4469 | 0.0621 | 77.7143 |
1.8176 | 86.0 | 516 | 8.4488 | 0.0621 | 77.7143 |
1.505 | 87.0 | 522 | 8.4512 | 0.0621 | 77.7143 |
5.016 | 88.0 | 528 | 8.4542 | 0.0622 | 77.5714 |
4.8956 | 89.0 | 534 | 8.4565 | 0.0625 | 77.1905 |
2.3939 | 90.0 | 540 | 8.4578 | 0.0625 | 77.1905 |
1.8629 | 91.0 | 546 | 8.4589 | 0.0622 | 77.5714 |
2.7315 | 92.0 | 552 | 8.4599 | 0.0617 | 78.1429 |
2.6185 | 93.0 | 558 | 8.4605 | 0.0618 | 78.1429 |
2.2754 | 94.0 | 564 | 8.4598 | 0.0617 | 78.2381 |
1.9322 | 95.0 | 570 | 8.4582 | 0.0616 | 78.381 |
2.1725 | 96.0 | 576 | 8.4583 | 0.0621 | 78.9524 |
2.603 | 97.0 | 582 | 8.4576 | 0.0619 | 79.1905 |
2.543 | 98.0 | 588 | 8.4569 | 0.0619 | 79.1905 |
2.4981 | 99.0 | 594 | 8.4563 | 0.0618 | 79.2857 |
1.8449 | 100.0 | 600 | 8.4561 | 0.063 | 80.0952 |
3.063 | 101.0 | 606 | 8.4559 | 0.0618 | 79.2857 |
1.7031 | 102.0 | 612 | 8.4564 | 0.0622 | 77.7143 |
2.6749 | 103.0 | 618 | 8.4563 | 0.0623 | 77.5714 |
2.5504 | 104.0 | 624 | 8.4558 | 0.0781 | 69.4286 |
1.785 | 105.0 | 630 | 8.4559 | 0.0791 | 69.4286 |
2.3876 | 106.0 | 636 | 8.4560 | 0.0753 | 70.5238 |
1.9649 | 107.0 | 642 | 8.4556 | 0.0613 | 78.4762 |
2.5544 | 108.0 | 648 | 8.4571 | 0.0617 | 78.3333 |
2.3048 | 109.0 | 654 | 8.4578 | 0.0619 | 77.9524 |
3.2234 | 110.0 | 660 | 8.4595 | 0.0618 | 77.9524 |
2.5271 | 111.0 | 666 | 8.4600 | 0.0619 | 77.7619 |
2.1592 | 112.0 | 672 | 8.4599 | 0.0621 | 77.8571 |
2.1582 | 113.0 | 678 | 8.4600 | 0.0618 | 77.9524 |
5.1356 | 114.0 | 684 | 8.4596 | 0.0622 | 77.6667 |
3.1661 | 115.0 | 690 | 8.4594 | 0.0622 | 77.7619 |
2.1159 | 116.0 | 696 | 8.4597 | 0.0617 | 78.2381 |
2.1355 | 117.0 | 702 | 8.4602 | 0.0612 | 78.7143 |
2.5071 | 118.0 | 708 | 8.4606 | 0.0631 | 79.9524 |
2.5419 | 119.0 | 714 | 8.4608 | 0.0631 | 80.0476 |
2.1749 | 120.0 | 720 | 8.4616 | 0.0617 | 79.381 |
2.1737 | 121.0 | 726 | 8.4622 | 0.0631 | 80.0476 |
2.2413 | 122.0 | 732 | 8.4623 | 0.0633 | 79.8095 |
2.2636 | 123.0 | 738 | 8.4624 | 0.0636 | 79.4762 |
2.9731 | 124.0 | 744 | 8.4624 | 0.0636 | 79.4762 |
2.6207 | 125.0 | 750 | 8.4621 | 0.0636 | 79.4762 |
2.6231 | 126.0 | 756 | 8.4602 | 0.0636 | 79.4762 |
2.4161 | 127.0 | 762 | 8.4605 | 0.0637 | 79.381 |
2.9764 | 128.0 | 768 | 8.4613 | 0.0762 | 70.9524 |
2.41 | 129.0 | 774 | 8.4618 | 0.0761 | 71.0476 |
2.1357 | 130.0 | 780 | 8.4620 | 0.0762 | 70.7143 |
3.211 | 131.0 | 786 | 8.4621 | 0.0762 | 70.7143 |
1.8992 | 132.0 | 792 | 8.4623 | 0.0633 | 79.7143 |
2.9689 | 133.0 | 798 | 8.4621 | 0.0631 | 79.9524 |
2.4456 | 134.0 | 804 | 8.4619 | 0.0629 | 80.0476 |
1.9567 | 135.0 | 810 | 8.4620 | 0.063 | 79.8571 |
4.3724 | 136.0 | 816 | 8.4619 | 0.0626 | 79.2381 |
2.2729 | 137.0 | 822 | 8.4623 | 0.0626 | 79.2381 |
2.2375 | 138.0 | 828 | 8.4620 | 0.0625 | 78.2381 |
2.0507 | 139.0 | 834 | 8.4617 | 0.0625 | 78.2381 |
3.2081 | 140.0 | 840 | 8.4621 | 0.1072 | 78.0952 |
3.0478 | 141.0 | 846 | 8.4629 | 0.1072 | 78.0952 |
1.6707 | 142.0 | 852 | 8.4628 | 0.1042 | 77.5238 |
2.7035 | 143.0 | 858 | 8.4626 | 0.1042 | 77.5238 |
2.0088 | 144.0 | 864 | 8.4627 | 0.1042 | 77.5238 |
2.2061 | 145.0 | 870 | 8.4619 | 0.1042 | 77.5238 |
2.9719 | 146.0 | 876 | 8.4597 | 0.1055 | 76.7143 |
1.7429 | 147.0 | 882 | 8.4591 | 0.1335 | 69.0952 |
2.0689 | 148.0 | 888 | 8.4590 | 0.1094 | 77.7143 |
3.0878 | 149.0 | 894 | 8.4593 | 0.1094 | 77.7143 |
2.3762 | 150.0 | 900 | 8.4593 | 0.1083 | 78.381 |
1.9409 | 151.0 | 906 | 8.4591 | 0.1083 | 78.381 |
2.472 | 152.0 | 912 | 8.4590 | 0.1328 | 70.1905 |
2.1888 | 153.0 | 918 | 8.4590 | 0.1341 | 69.619 |
2.8783 | 154.0 | 924 | 8.4582 | 0.1341 | 69.619 |
2.4719 | 155.0 | 930 | 8.4582 | 0.1318 | 68.9524 |
2.4873 | 156.0 | 936 | 8.4579 | 0.1318 | 68.9524 |
2.202 | 157.0 | 942 | 8.4576 | 0.1318 | 68.9524 |
2.4128 | 158.0 | 948 | 8.4577 | 0.1318 | 68.9524 |
1.6922 | 159.0 | 954 | 8.4577 | 0.1318 | 68.9524 |
2.5719 | 160.0 | 960 | 8.4582 | 0.1318 | 68.9524 |
1.8392 | 161.0 | 966 | 8.4581 | 0.1318 | 68.9524 |
2.1349 | 162.0 | 972 | 8.4581 | 0.1318 | 68.9524 |
2.0836 | 163.0 | 978 | 8.4586 | 0.1318 | 68.9524 |
2.5173 | 164.0 | 984 | 8.4590 | 0.1318 | 68.9524 |
1.9422 | 165.0 | 990 | 8.4591 | 0.1318 | 68.9524 |
2.4949 | 166.0 | 996 | 8.4591 | 0.1318 | 68.9524 |
2.6692 | 167.0 | 1002 | 8.4586 | 0.1318 | 68.9524 |
1.5472 | 168.0 | 1008 | 8.4588 | 0.1318 | 68.9524 |
5.0693 | 169.0 | 1014 | 8.4589 | 0.1318 | 68.9524 |
2.6937 | 170.0 | 1020 | 8.4593 | 0.1318 | 68.9524 |
5.0729 | 171.0 | 1026 | 8.4596 | 0.1306 | 69.5238 |
2.645 | 172.0 | 1032 | 8.4599 | 0.1306 | 69.5238 |
1.671 | 173.0 | 1038 | 8.4600 | 0.1306 | 69.5238 |
2.329 | 174.0 | 1044 | 8.4600 | 0.1306 | 69.5238 |
2.2443 | 175.0 | 1050 | 8.4597 | 0.1306 | 69.5238 |
2.0599 | 176.0 | 1056 | 8.4594 | 0.1306 | 69.5238 |
2.0761 | 177.0 | 1062 | 8.4598 | 0.1639 | 60.7619 |
2.3301 | 178.0 | 1068 | 8.4595 | 0.1306 | 69.5238 |
2.8817 | 179.0 | 1074 | 8.4595 | 0.1306 | 69.5238 |
2.3847 | 180.0 | 1080 | 8.4588 | 0.1312 | 69.5238 |
2.7967 | 181.0 | 1086 | 8.4586 | 0.1312 | 69.5238 |
1.6165 | 182.0 | 1092 | 8.4590 | 0.1308 | 69.6667 |
3.2699 | 183.0 | 1098 | 8.4585 | 0.1308 | 69.6667 |
2.1596 | 184.0 | 1104 | 8.4587 | 0.1308 | 69.6667 |
4.383 | 185.0 | 1110 | 8.4587 | 0.1308 | 69.6667 |
2.5019 | 186.0 | 1116 | 8.4587 | 0.1308 | 69.6667 |
2.1497 | 187.0 | 1122 | 8.4587 | 0.1308 | 69.6667 |
2.7942 | 188.0 | 1128 | 8.4594 | 0.1342 | 69.7619 |
2.5737 | 189.0 | 1134 | 8.4595 | 0.1342 | 69.7619 |
2.7013 | 190.0 | 1140 | 8.4597 | 0.1342 | 69.7619 |
4.7672 | 191.0 | 1146 | 8.4598 | 0.1342 | 69.7619 |
4.723 | 192.0 | 1152 | 8.4598 | 0.1342 | 69.7619 |
2.2355 | 193.0 | 1158 | 8.4598 | 0.1342 | 69.7619 |
1.7872 | 194.0 | 1164 | 8.4599 | 0.1342 | 69.7619 |
2.0794 | 195.0 | 1170 | 8.4600 | 0.1342 | 69.7619 |
1.6962 | 196.0 | 1176 | 8.4601 | 0.1342 | 69.7619 |
2.2855 | 197.0 | 1182 | 8.4602 | 0.1342 | 69.7619 |
2.8048 | 198.0 | 1188 | 8.4603 | 0.1346 | 69.619 |
1.8135 | 199.0 | 1194 | 8.4603 | 0.1346 | 69.619 |
2.395 | 200.0 | 1200 | 8.4603 | 0.1346 | 69.619 |
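To make the table easier to read, the sketch below summarizes a handful of rows copied from it. The rows were chosen by hand: the first and last epochs, plus the epochs that appear to have the highest BLEU (35), the lowest validation loss (75), and the highest validation loss (141). The tuple layout and variable names are illustrative.

```python
# Selected rows copied from the table above: (epoch, validation_loss, bleu, gen_len)
rows = [
    (1, 8.4517, 0.0956, 61.6667),
    (35, 8.4443, 0.1671, 61.4286),
    (75, 8.4357, 0.0763, 70.7619),
    (141, 8.4629, 0.1072, 78.0952),
    (200, 8.4603, 0.1346, 69.619),
]

best_bleu = max(rows, key=lambda r: r[2])    # row with the highest BLEU
lowest_loss = min(rows, key=lambda r: r[1])  # row with the lowest validation loss
loss_spread = max(r[1] for r in rows) - min(r[1] for r in rows)

print(f"highest BLEU: {best_bleu[2]} at epoch {best_bleu[0]}")
print(f"lowest validation loss: {lowest_loss[1]} at epoch {lowest_loss[0]}")
print(f"validation loss spread: {loss_spread:.4f}")
```

Among these rows the validation loss varies by only about 0.027 over 200 epochs, and the final checkpoint (epoch 200) is neither the lowest-loss nor the highest-BLEU row.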
Framework versions
- Transformers 4.35.2
- Pytorch 1.13.1+cu117
- Datasets 2.16.1
- Tokenizers 0.15.0
Model tree for kmok1/cs_m2m_0.00001_200_v0.2
Base model
facebook/m2m100_1.2B