
cs_m2m_0.00001_200_v0.2

This model is a fine-tuned version of facebook/m2m100_1.2B on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 8.4603
  • Bleu: 0.1346
  • Gen Len: 69.619

Model description

More information needed

Intended uses & limitations

More information needed
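
The card gives no usage guidance, so the following inference sketch is an assumption, not documented behavior: the "cs" in the model name presumably refers to Czech as the target language, and the source language is unknown. Given the reported BLEU of 0.1346, output quality is likely to be poor.

```python
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

# Hypothetical usage sketch: the training data and language pair are not
# documented. The "cs_" prefix in the model name is assumed to mean the
# target language is Czech ("cs"); adjust src_lang/get_lang_id as needed.
model_id = "kmok1/cs_m2m_0.00001_200_v0.2"
tokenizer = M2M100Tokenizer.from_pretrained(model_id)
model = M2M100ForConditionalGeneration.from_pretrained(model_id)

tokenizer.src_lang = "en"  # assumption: source language is not documented
inputs = tokenizer("Hello, world!", return_tensors="pt")
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.get_lang_id("cs"),  # force Czech output
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```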

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of how they map onto the Trainer API follows the list):

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 200
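
As a reference point, here is a minimal sketch of how these hyperparameters map onto the Transformers `Seq2SeqTrainer` API. This is an assumption based on the auto-generated card; the actual training script, dataset, and preprocessing are not published.

```python
from transformers import (
    M2M100ForConditionalGeneration,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

model = M2M100ForConditionalGeneration.from_pretrained("facebook/m2m100_1.2B")

# Values mirror the hyperparameter list above; the Adam betas/epsilon listed
# there are the Trainer defaults, so they need no explicit arguments.
args = Seq2SeqTrainingArguments(
    output_dir="cs_m2m_0.00001_200_v0.2",
    learning_rate=1e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=200,
    evaluation_strategy="epoch",  # assumption: the results table logs one eval per epoch
    predict_with_generate=True,   # assumption: required for the Bleu/Gen Len metrics
)

# trainer = Seq2SeqTrainer(model=model, args=args,
#                          train_dataset=...,  # unknown (see "Training and evaluation data")
#                          eval_dataset=...)
# trainer.train()
```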

Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu   | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:------:|:-------:|
| 2.684  | 1.0   | 6    | 8.4517 | 0.0956 | 61.6667 |
| 1.978  | 2.0   | 12   | 8.4546 | 0.0985 | 61.8095 |
| 2.8654 | 3.0   | 18   | 8.4538 | 0.0961 | 62.4286 |
| 2.8165 | 4.0   | 24   | 8.4550 | 0.0991 | 63.1905 |
| 2.6606 | 5.0   | 30   | 8.4556 | 0.0956 | 61.0476 |
| 3.1159 | 6.0   | 36   | 8.4525 | 0.0964 | 60.5238 |
| 1.813  | 7.0   | 42   | 8.4524 | 0.0961 | 59.8095 |
| 2.9637 | 8.0   | 48   | 8.4520 | 0.0961 | 59.8095 |
| 2.1663 | 9.0   | 54   | 8.4526 | 0.0918 | 59.5714 |
| 2.475  | 10.0  | 60   | 8.4516 | 0.0916 | 59.381  |
| 2.5769 | 11.0  | 66   | 8.4493 | 0.0927 | 60.1905 |
| 2.414  | 12.0  | 72   | 8.4485 | 0.0927 | 60.1905 |
| 2.5985 | 13.0  | 78   | 8.4500 | 0.0946 | 60.1905 |
| 2.6263 | 14.0  | 84   | 8.4527 | 0.1003 | 61.0    |
| 2.2439 | 15.0  | 90   | 8.4533 | 0.0774 | 69.0952 |
| 1.9865 | 16.0  | 96   | 8.4542 | 0.0769 | 69.5238 |
| 2.2472 | 17.0  | 102  | 8.4540 | 0.0766 | 69.7619 |
| 2.5489 | 18.0  | 108  | 8.4534 | 0.0782 | 70.3333 |
| 1.9181 | 19.0  | 114  | 8.4527 | 0.0789 | 70.5714 |
| 2.0332 | 20.0  | 120  | 8.4505 | 0.0785 | 70.7619 |
| 1.9397 | 21.0  | 126  | 8.4488 | 0.0784 | 70.9048 |
| 2.788  | 22.0  | 132  | 8.4480 | 0.0772 | 71.9524 |
| 2.4842 | 23.0  | 138  | 8.4473 | 0.0778 | 71.6667 |
| 2.3397 | 24.0  | 144  | 8.4459 | 0.0975 | 62.6667 |
| 2.3303 | 25.0  | 150  | 8.4448 | 0.1314 | 71.9048 |
| 2.6417 | 26.0  | 156  | 8.4436 | 0.1311 | 71.9524 |
| 2.0759 | 27.0  | 162  | 8.4446 | 0.128  | 71.9524 |
| 2.0973 | 28.0  | 168  | 8.4450 | 0.1659 | 62.1905 |
| 2.9593 | 29.0  | 174  | 8.4455 | 0.1285 | 71.4762 |
| 3.0086 | 30.0  | 180  | 8.4442 | 0.1624 | 61.8571 |
| 2.684  | 31.0  | 186  | 8.4431 | 0.162  | 62.0952 |
| 2.7015 | 32.0  | 192  | 8.4442 | 0.162  | 62.0952 |
| 4.6745 | 33.0  | 198  | 8.4431 | 0.1624 | 62.9048 |
| 2.1913 | 34.0  | 204  | 8.4427 | 0.1607 | 63.0    |
| 2.1685 | 35.0  | 210  | 8.4443 | 0.1671 | 61.4286 |
| 2.3458 | 36.0  | 216  | 8.4458 | 0.1346 | 69.6667 |
| 2.0533 | 37.0  | 222  | 8.4456 | 0.132  | 70.1905 |
| 3.1101 | 38.0  | 228  | 8.4442 | 0.1335 | 69.8095 |
| 2.2737 | 39.0  | 234  | 8.4447 | 0.0787 | 70.7619 |
| 2.4838 | 40.0  | 240  | 8.4476 | 0.0784 | 70.1905 |
| 1.9048 | 41.0  | 246  | 8.4487 | 0.0801 | 70.4762 |
| 2.825  | 42.0  | 252  | 8.4495 | 0.0668 | 79.4286 |
| 1.7811 | 43.0  | 258  | 8.4521 | 0.0639 | 78.2381 |
| 2.1382 | 44.0  | 264  | 8.4545 | 0.0639 | 78.1429 |
| 2.2783 | 45.0  | 270  | 8.4553 | 0.0636 | 78.5714 |
| 2.1117 | 46.0  | 276  | 8.4558 | 0.0636 | 78.5714 |
| 2.0165 | 47.0  | 282  | 8.4563 | 0.0638 | 78.4762 |
| 2.2424 | 48.0  | 288  | 8.4568 | 0.0639 | 78.3333 |
| 2.7404 | 49.0  | 294  | 8.4564 | 0.0627 | 79.5714 |
| 3.3443 | 50.0  | 300  | 8.4560 | 0.0617 | 78.4762 |
| 2.7281 | 51.0  | 306  | 8.4551 | 0.0617 | 78.4762 |
| 2.9189 | 52.0  | 312  | 8.4520 | 0.0757 | 70.7143 |
| 2.3192 | 53.0  | 318  | 8.4512 | 0.0754 | 70.7619 |
| 2.3737 | 54.0  | 324  | 8.4505 | 0.0604 | 78.4286 |
| 2.4041 | 55.0  | 330  | 8.4490 | 0.0606 | 78.0952 |
| 4.5412 | 56.0  | 336  | 8.4478 | 0.0618 | 78.0952 |
| 2.399  | 57.0  | 342  | 8.4469 | 0.0617 | 78.2381 |
| 1.8226 | 58.0  | 348  | 8.4467 | 0.062  | 77.9048 |
| 2.3362 | 59.0  | 354  | 8.4463 | 0.0612 | 77.4762 |
| 2.4263 | 60.0  | 360  | 8.4450 | 0.0612 | 77.4762 |
| 2.7929 | 61.0  | 366  | 8.4439 | 0.0617 | 78.2381 |
| 3.2633 | 62.0  | 372  | 8.4434 | 0.0615 | 78.3333 |
| 2.3451 | 63.0  | 378  | 8.4436 | 0.0607 | 77.9048 |
| 2.8337 | 64.0  | 384  | 8.4429 | 0.061  | 77.4762 |
| 2.7405 | 65.0  | 390  | 8.4430 | 0.0607 | 77.9048 |
| 2.8955 | 66.0  | 396  | 8.4420 | 0.0614 | 78.6667 |
| 2.3475 | 67.0  | 402  | 8.4408 | 0.061  | 79.0952 |
| 2.0904 | 68.0  | 408  | 8.4383 | 0.0608 | 79.1905 |
| 2.4816 | 69.0  | 414  | 8.4367 | 0.0607 | 79.3333 |
| 2.3696 | 70.0  | 420  | 8.4365 | 0.0607 | 79.3333 |
| 2.7587 | 71.0  | 426  | 8.4364 | 0.0616 | 79.5714 |
| 2.0684 | 72.0  | 432  | 8.4369 | 0.0617 | 79.4762 |
| 2.5021 | 73.0  | 438  | 8.4375 | 0.0617 | 79.4762 |
| 1.4037 | 74.0  | 444  | 8.4362 | 0.0759 | 71.0476 |
| 2.1197 | 75.0  | 450  | 8.4357 | 0.0763 | 70.7619 |
| 2.2019 | 76.0  | 456  | 8.4378 | 0.0612 | 78.8571 |
| 1.8674 | 77.0  | 462  | 8.4402 | 0.062  | 77.7619 |
| 4.6628 | 78.0  | 468  | 8.4415 | 0.0769 | 69.3333 |
| 2.5704 | 79.0  | 474  | 8.4420 | 0.0769 | 69.3333 |
| 1.8771 | 80.0  | 480  | 8.4422 | 0.0772 | 69.1905 |
| 1.9444 | 81.0  | 486  | 8.4437 | 0.078  | 70.5238 |
| 2.0133 | 82.0  | 492  | 8.4443 | 0.0771 | 71.1429 |
| 2.8815 | 83.0  | 498  | 8.4445 | 0.0757 | 70.4286 |
| 3.0573 | 84.0  | 504  | 8.4455 | 0.0621 | 77.7143 |
| 2.011  | 85.0  | 510  | 8.4469 | 0.0621 | 77.7143 |
| 1.8176 | 86.0  | 516  | 8.4488 | 0.0621 | 77.7143 |
| 1.505  | 87.0  | 522  | 8.4512 | 0.0621 | 77.7143 |
| 5.016  | 88.0  | 528  | 8.4542 | 0.0622 | 77.5714 |
| 4.8956 | 89.0  | 534  | 8.4565 | 0.0625 | 77.1905 |
| 2.3939 | 90.0  | 540  | 8.4578 | 0.0625 | 77.1905 |
| 1.8629 | 91.0  | 546  | 8.4589 | 0.0622 | 77.5714 |
| 2.7315 | 92.0  | 552  | 8.4599 | 0.0617 | 78.1429 |
| 2.6185 | 93.0  | 558  | 8.4605 | 0.0618 | 78.1429 |
| 2.2754 | 94.0  | 564  | 8.4598 | 0.0617 | 78.2381 |
| 1.9322 | 95.0  | 570  | 8.4582 | 0.0616 | 78.381  |
| 2.1725 | 96.0  | 576  | 8.4583 | 0.0621 | 78.9524 |
| 2.603  | 97.0  | 582  | 8.4576 | 0.0619 | 79.1905 |
| 2.543  | 98.0  | 588  | 8.4569 | 0.0619 | 79.1905 |
| 2.4981 | 99.0  | 594  | 8.4563 | 0.0618 | 79.2857 |
| 1.8449 | 100.0 | 600  | 8.4561 | 0.063  | 80.0952 |
| 3.063  | 101.0 | 606  | 8.4559 | 0.0618 | 79.2857 |
| 1.7031 | 102.0 | 612  | 8.4564 | 0.0622 | 77.7143 |
| 2.6749 | 103.0 | 618  | 8.4563 | 0.0623 | 77.5714 |
| 2.5504 | 104.0 | 624  | 8.4558 | 0.0781 | 69.4286 |
| 1.785  | 105.0 | 630  | 8.4559 | 0.0791 | 69.4286 |
| 2.3876 | 106.0 | 636  | 8.4560 | 0.0753 | 70.5238 |
| 1.9649 | 107.0 | 642  | 8.4556 | 0.0613 | 78.4762 |
| 2.5544 | 108.0 | 648  | 8.4571 | 0.0617 | 78.3333 |
| 2.3048 | 109.0 | 654  | 8.4578 | 0.0619 | 77.9524 |
| 3.2234 | 110.0 | 660  | 8.4595 | 0.0618 | 77.9524 |
| 2.5271 | 111.0 | 666  | 8.4600 | 0.0619 | 77.7619 |
| 2.1592 | 112.0 | 672  | 8.4599 | 0.0621 | 77.8571 |
| 2.1582 | 113.0 | 678  | 8.4600 | 0.0618 | 77.9524 |
| 5.1356 | 114.0 | 684  | 8.4596 | 0.0622 | 77.6667 |
| 3.1661 | 115.0 | 690  | 8.4594 | 0.0622 | 77.7619 |
| 2.1159 | 116.0 | 696  | 8.4597 | 0.0617 | 78.2381 |
| 2.1355 | 117.0 | 702  | 8.4602 | 0.0612 | 78.7143 |
| 2.5071 | 118.0 | 708  | 8.4606 | 0.0631 | 79.9524 |
| 2.5419 | 119.0 | 714  | 8.4608 | 0.0631 | 80.0476 |
| 2.1749 | 120.0 | 720  | 8.4616 | 0.0617 | 79.381  |
| 2.1737 | 121.0 | 726  | 8.4622 | 0.0631 | 80.0476 |
| 2.2413 | 122.0 | 732  | 8.4623 | 0.0633 | 79.8095 |
| 2.2636 | 123.0 | 738  | 8.4624 | 0.0636 | 79.4762 |
| 2.9731 | 124.0 | 744  | 8.4624 | 0.0636 | 79.4762 |
| 2.6207 | 125.0 | 750  | 8.4621 | 0.0636 | 79.4762 |
| 2.6231 | 126.0 | 756  | 8.4602 | 0.0636 | 79.4762 |
| 2.4161 | 127.0 | 762  | 8.4605 | 0.0637 | 79.381  |
| 2.9764 | 128.0 | 768  | 8.4613 | 0.0762 | 70.9524 |
| 2.41   | 129.0 | 774  | 8.4618 | 0.0761 | 71.0476 |
| 2.1357 | 130.0 | 780  | 8.4620 | 0.0762 | 70.7143 |
| 3.211  | 131.0 | 786  | 8.4621 | 0.0762 | 70.7143 |
| 1.8992 | 132.0 | 792  | 8.4623 | 0.0633 | 79.7143 |
| 2.9689 | 133.0 | 798  | 8.4621 | 0.0631 | 79.9524 |
| 2.4456 | 134.0 | 804  | 8.4619 | 0.0629 | 80.0476 |
| 1.9567 | 135.0 | 810  | 8.4620 | 0.063  | 79.8571 |
| 4.3724 | 136.0 | 816  | 8.4619 | 0.0626 | 79.2381 |
| 2.2729 | 137.0 | 822  | 8.4623 | 0.0626 | 79.2381 |
| 2.2375 | 138.0 | 828  | 8.4620 | 0.0625 | 78.2381 |
| 2.0507 | 139.0 | 834  | 8.4617 | 0.0625 | 78.2381 |
| 3.2081 | 140.0 | 840  | 8.4621 | 0.1072 | 78.0952 |
| 3.0478 | 141.0 | 846  | 8.4629 | 0.1072 | 78.0952 |
| 1.6707 | 142.0 | 852  | 8.4628 | 0.1042 | 77.5238 |
| 2.7035 | 143.0 | 858  | 8.4626 | 0.1042 | 77.5238 |
| 2.0088 | 144.0 | 864  | 8.4627 | 0.1042 | 77.5238 |
| 2.2061 | 145.0 | 870  | 8.4619 | 0.1042 | 77.5238 |
| 2.9719 | 146.0 | 876  | 8.4597 | 0.1055 | 76.7143 |
| 1.7429 | 147.0 | 882  | 8.4591 | 0.1335 | 69.0952 |
| 2.0689 | 148.0 | 888  | 8.4590 | 0.1094 | 77.7143 |
| 3.0878 | 149.0 | 894  | 8.4593 | 0.1094 | 77.7143 |
| 2.3762 | 150.0 | 900  | 8.4593 | 0.1083 | 78.381  |
| 1.9409 | 151.0 | 906  | 8.4591 | 0.1083 | 78.381  |
| 2.472  | 152.0 | 912  | 8.4590 | 0.1328 | 70.1905 |
| 2.1888 | 153.0 | 918  | 8.4590 | 0.1341 | 69.619  |
| 2.8783 | 154.0 | 924  | 8.4582 | 0.1341 | 69.619  |
| 2.4719 | 155.0 | 930  | 8.4582 | 0.1318 | 68.9524 |
| 2.4873 | 156.0 | 936  | 8.4579 | 0.1318 | 68.9524 |
| 2.202  | 157.0 | 942  | 8.4576 | 0.1318 | 68.9524 |
| 2.4128 | 158.0 | 948  | 8.4577 | 0.1318 | 68.9524 |
| 1.6922 | 159.0 | 954  | 8.4577 | 0.1318 | 68.9524 |
| 2.5719 | 160.0 | 960  | 8.4582 | 0.1318 | 68.9524 |
| 1.8392 | 161.0 | 966  | 8.4581 | 0.1318 | 68.9524 |
| 2.1349 | 162.0 | 972  | 8.4581 | 0.1318 | 68.9524 |
| 2.0836 | 163.0 | 978  | 8.4586 | 0.1318 | 68.9524 |
| 2.5173 | 164.0 | 984  | 8.4590 | 0.1318 | 68.9524 |
| 1.9422 | 165.0 | 990  | 8.4591 | 0.1318 | 68.9524 |
| 2.4949 | 166.0 | 996  | 8.4591 | 0.1318 | 68.9524 |
| 2.6692 | 167.0 | 1002 | 8.4586 | 0.1318 | 68.9524 |
| 1.5472 | 168.0 | 1008 | 8.4588 | 0.1318 | 68.9524 |
| 5.0693 | 169.0 | 1014 | 8.4589 | 0.1318 | 68.9524 |
| 2.6937 | 170.0 | 1020 | 8.4593 | 0.1318 | 68.9524 |
| 5.0729 | 171.0 | 1026 | 8.4596 | 0.1306 | 69.5238 |
| 2.645  | 172.0 | 1032 | 8.4599 | 0.1306 | 69.5238 |
| 1.671  | 173.0 | 1038 | 8.4600 | 0.1306 | 69.5238 |
| 2.329  | 174.0 | 1044 | 8.4600 | 0.1306 | 69.5238 |
| 2.2443 | 175.0 | 1050 | 8.4597 | 0.1306 | 69.5238 |
| 2.0599 | 176.0 | 1056 | 8.4594 | 0.1306 | 69.5238 |
| 2.0761 | 177.0 | 1062 | 8.4598 | 0.1639 | 60.7619 |
| 2.3301 | 178.0 | 1068 | 8.4595 | 0.1306 | 69.5238 |
| 2.8817 | 179.0 | 1074 | 8.4595 | 0.1306 | 69.5238 |
| 2.3847 | 180.0 | 1080 | 8.4588 | 0.1312 | 69.5238 |
| 2.7967 | 181.0 | 1086 | 8.4586 | 0.1312 | 69.5238 |
| 1.6165 | 182.0 | 1092 | 8.4590 | 0.1308 | 69.6667 |
| 3.2699 | 183.0 | 1098 | 8.4585 | 0.1308 | 69.6667 |
| 2.1596 | 184.0 | 1104 | 8.4587 | 0.1308 | 69.6667 |
| 4.383  | 185.0 | 1110 | 8.4587 | 0.1308 | 69.6667 |
| 2.5019 | 186.0 | 1116 | 8.4587 | 0.1308 | 69.6667 |
| 2.1497 | 187.0 | 1122 | 8.4587 | 0.1308 | 69.6667 |
| 2.7942 | 188.0 | 1128 | 8.4594 | 0.1342 | 69.7619 |
| 2.5737 | 189.0 | 1134 | 8.4595 | 0.1342 | 69.7619 |
| 2.7013 | 190.0 | 1140 | 8.4597 | 0.1342 | 69.7619 |
| 4.7672 | 191.0 | 1146 | 8.4598 | 0.1342 | 69.7619 |
| 4.723  | 192.0 | 1152 | 8.4598 | 0.1342 | 69.7619 |
| 2.2355 | 193.0 | 1158 | 8.4598 | 0.1342 | 69.7619 |
| 1.7872 | 194.0 | 1164 | 8.4599 | 0.1342 | 69.7619 |
| 2.0794 | 195.0 | 1170 | 8.4600 | 0.1342 | 69.7619 |
| 1.6962 | 196.0 | 1176 | 8.4601 | 0.1342 | 69.7619 |
| 2.2855 | 197.0 | 1182 | 8.4602 | 0.1342 | 69.7619 |
| 2.8048 | 198.0 | 1188 | 8.4603 | 0.1346 | 69.619  |
| 1.8135 | 199.0 | 1194 | 8.4603 | 0.1346 | 69.619  |
| 2.395  | 200.0 | 1200 | 8.4603 | 0.1346 | 69.619  |
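
The Bleu and Gen Len columns match the standard Transformers translation recipe, which computes them from generated sequences in a `compute_metrics` hook. The actual evaluation code is not published; the following is a plausible reconstruction, not the author's script. Note that sacrebleu reports scores on a 0-100 scale, so the values above indicate near-zero translation quality, consistent with the high validation loss.

```python
import evaluate
import numpy as np
from transformers import M2M100Tokenizer

tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_1.2B")
bleu = evaluate.load("sacrebleu")  # sacrebleu reports BLEU on a 0-100 scale

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    # Labels use -100 for padding; swap it for the pad token before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = bleu.compute(predictions=decoded_preds,
                          references=[[label] for label in decoded_labels])
    # "Gen Len" is the mean number of non-pad tokens in the generated output.
    gen_len = np.mean([np.count_nonzero(p != tokenizer.pad_token_id) for p in preds])
    return {"bleu": result["score"], "gen_len": gen_len}
```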

Framework versions

  • Transformers 4.35.2
  • Pytorch 1.13.1+cu117
  • Datasets 2.16.1
  • Tokenizers 0.15.0