license: apache-2.0
tags:
  • Summarization
metrics:
  • rouge
model-index:
  • name: best_model_test_0423_small
    results: []

best_model_test_0423_small

This model is a fine-tuned version of google/mt5-small. The training dataset is not specified (the auto-generated card records it as "None"). The model achieves the following results on the evaluation set:

  • Loss: 2.6341
  • Rouge1: 18.7681
  • Rouge2: 6.3762
  • Rougel: 18.6081
  • Rougelsum: 18.6173
  • Gen Len: 22.1086
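
To make the Rouge1 figure above easier to interpret, here is a minimal sketch of a unigram-overlap F1 score. It is an illustration only: the card does not say which ROUGE implementation or tokenizer produced the reported numbers. The whitespace tokenization and the 0 to 100 scaling are assumptions, and the function name `rouge1_f1` is hypothetical.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap F1 on a 0-100 scale (simplified ROUGE-1).

    Assumption: tokens are split on whitespace. Real ROUGE implementations
    usually apply their own tokenization and optional stemming.
    """
    ref_counts = Counter(reference.split())
    cand_counts = Counter(candidate.split())

    # Clipped overlap: each token counts at most as often as it
    # appears in both the reference and the candidate.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 100.0 * 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(rouge1_f1("the cat sat on the mat", "the cat is on the mat"))
```

On this scale, a Rouge1 of about 18.8 means that, averaged over the evaluation set, unigram overlap between generated and reference summaries is fairly low.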

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 3
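
As a rough illustration of what `lr_scheduler_type: linear` implies for the learning rate, the sketch below decays it linearly from 0.0001 to zero over the run. It is written in plain Python and does not reproduce the original training script. The function name `linear_lr` is hypothetical, zero warmup steps is an assumption (no warmup is listed above), and the total step count is an estimate from the training results, where step 57000 corresponds to epoch 2.96.

```python
BASE_LR = 1e-4  # learning_rate from the hyperparameter list


def linear_lr(step: int, total_steps: int,
              base_lr: float = BASE_LR, warmup_steps: int = 0) -> float:
    """Learning rate at `step` under linear warmup followed by linear decay.

    Assumption: warmup_steps defaults to 0 because the card lists no warmup.
    """
    if step < warmup_steps:
        return base_lr * (step / max(1, warmup_steps))
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


if __name__ == "__main__":
    # Assumption: roughly 57,750 optimizer steps in total
    # (estimated from 57000 steps at epoch 2.96, over 3 epochs).
    total = 57_750
    for step in (0, 19_250, 38_500, 57_750):
        print(step, linear_lr(step, total))
```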

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|---|---|---|---|---|---|---|---|---|
| 5.8165 | 0.05 | 1000 | 3.6541 | 11.6734 | 3.9865 | 11.5734 | 11.5375 | 18.0056 |
| 4.306 | 0.1 | 2000 | 3.4291 | 12.0417 | 3.8419 | 11.9231 | 11.9223 | 16.8948 |
| 4.1091 | 0.16 | 3000 | 3.3643 | 13.661 | 4.5171 | 13.5123 | 13.5076 | 19.4016 |
| 3.9637 | 0.21 | 4000 | 3.2574 | 13.8443 | 4.1761 | 13.689 | 13.6927 | 18.4288 |
| 3.8205 | 0.26 | 5000 | 3.2434 | 13.5371 | 4.3639 | 13.3551 | 13.3552 | 21.5776 |
| 3.7262 | 0.31 | 6000 | 3.1690 | 14.3668 | 4.8048 | 14.2191 | 14.1906 | 21.5548 |
| 3.6887 | 0.36 | 7000 | 3.0657 | 14.3265 | 4.436 | 14.212 | 14.205 | 20.89 |
| 3.6337 | 0.42 | 8000 | 3.0318 | 14.6809 | 4.8345 | 14.5378 | 14.5331 | 20.3651 |
| 3.5443 | 0.47 | 9000 | 3.0554 | 15.3372 | 4.9163 | 15.1794 | 15.1781 | 21.7742 |
| 3.5203 | 0.52 | 10000 | 2.9793 | 14.9278 | 4.9656 | 14.7491 | 14.743 | 20.8113 |
| 3.4936 | 0.57 | 11000 | 3.0079 | 15.7705 | 5.1453 | 15.5582 | 15.5756 | 23.4274 |
| 3.4592 | 0.62 | 12000 | 2.9721 | 15.0201 | 5.1612 | 14.8508 | 14.8198 | 22.7007 |
| 3.377 | 0.67 | 13000 | 3.0112 | 15.9595 | 5.1133 | 15.78 | 15.7774 | 23.4427 |
| 3.4158 | 0.73 | 14000 | 2.9239 | 14.7984 | 5.051 | 14.6943 | 14.6581 | 21.6009 |
| 3.378 | 0.78 | 15000 | 2.8897 | 16.5128 | 5.1923 | 16.3523 | 16.3265 | 22.0828 |
| 3.3231 | 0.83 | 16000 | 2.9347 | 16.9997 | 5.5524 | 16.8534 | 16.8737 | 22.5807 |
| 3.3268 | 0.88 | 17000 | 2.9116 | 16.0261 | 5.4226 | 15.9234 | 15.914 | 23.6988 |
| 3.3127 | 0.93 | 18000 | 2.8610 | 16.6255 | 5.3554 | 16.4729 | 16.4569 | 22.9481 |
| 3.2664 | 0.99 | 19000 | 2.8606 | 17.7703 | 5.9475 | 17.6229 | 17.6259 | 23.4423 |
| 3.1718 | 1.04 | 20000 | 2.8764 | 17.301 | 5.6262 | 17.122 | 17.1104 | 23.0093 |
| 3.0987 | 1.09 | 21000 | 2.8282 | 16.4718 | 5.2077 | 16.3394 | 16.3401 | 20.9697 |
| 3.1486 | 1.14 | 22000 | 2.8235 | 18.5594 | 5.9469 | 18.3882 | 18.3799 | 22.7291 |
| 3.1435 | 1.19 | 23000 | 2.8261 | 18.111 | 6.0309 | 17.9593 | 17.9613 | 22.9612 |
| 3.1049 | 1.25 | 24000 | 2.8068 | 17.124 | 5.5675 | 16.9714 | 16.9876 | 22.5558 |
| 3.1357 | 1.3 | 25000 | 2.8014 | 17.3916 | 5.8671 | 17.2148 | 17.2502 | 23.0075 |
| 3.0904 | 1.35 | 26000 | 2.7790 | 17.419 | 5.6689 | 17.3125 | 17.3058 | 22.1492 |
| 3.0877 | 1.4 | 27000 | 2.7462 | 17.0605 | 5.4735 | 16.9414 | 16.9378 | 21.7522 |
| 3.0694 | 1.45 | 28000 | 2.7563 | 17.752 | 5.8889 | 17.5967 | 17.619 | 23.2005 |
| 3.0498 | 1.51 | 29000 | 2.7521 | 17.9056 | 5.7754 | 17.7624 | 17.7836 | 21.9369 |
| 3.0566 | 1.56 | 30000 | 2.7468 | 18.6531 | 6.0538 | 18.5397 | 18.5038 | 22.2358 |
| 3.0489 | 1.61 | 31000 | 2.7450 | 18.4869 | 5.9297 | 18.3139 | 18.3169 | 22.0108 |
| 3.0247 | 1.66 | 32000 | 2.7449 | 18.5192 | 5.9966 | 18.3721 | 18.3569 | 22.2071 |
| 2.9877 | 1.71 | 33000 | 2.7160 | 18.1655 | 5.9294 | 18.0304 | 18.0836 | 21.4595 |
| 3.0383 | 1.76 | 34000 | 2.7202 | 18.4959 | 6.2413 | 18.3363 | 18.3431 | 22.9732 |
| 3.041 | 1.82 | 35000 | 2.6948 | 17.5306 | 5.8119 | 17.4011 | 17.4149 | 21.9435 |
| 2.9285 | 1.87 | 36000 | 2.6957 | 18.6418 | 6.1394 | 18.514 | 18.4823 | 22.5174 |
| 3.0556 | 1.92 | 37000 | 2.7000 | 18.7387 | 6.0585 | 18.5761 | 18.574 | 22.9315 |
| 3.0033 | 1.97 | 38000 | 2.6974 | 17.9387 | 6.1387 | 17.8271 | 17.8111 | 22.4726 |
| 2.9207 | 2.02 | 39000 | 2.6998 | 18.6073 | 6.1906 | 18.3891 | 18.4103 | 23.0274 |
| 2.8922 | 2.08 | 40000 | 2.6798 | 18.4017 | 6.2244 | 18.2321 | 18.2296 | 22.0697 |
| 2.8938 | 2.13 | 41000 | 2.6666 | 18.8016 | 6.2066 | 18.6411 | 18.6353 | 21.7017 |
| 2.9124 | 2.18 | 42000 | 2.6606 | 18.7544 | 6.3533 | 18.5923 | 18.5739 | 21.4303 |
| 2.8597 | 2.23 | 43000 | 2.6947 | 18.8672 | 6.4526 | 18.7416 | 18.7482 | 22.3352 |
| 2.8435 | 2.28 | 44000 | 2.6738 | 18.9405 | 6.356 | 18.7791 | 18.7729 | 21.9081 |
| 2.8672 | 2.34 | 45000 | 2.6734 | 18.7509 | 6.3991 | 18.6175 | 18.5828 | 21.8869 |
| 2.899 | 2.39 | 46000 | 2.6575 | 18.5529 | 6.3489 | 18.4139 | 18.401 | 21.7694 |
| 2.8616 | 2.44 | 47000 | 2.6485 | 18.7563 | 6.268 | 18.6368 | 18.6253 | 21.5685 |
| 2.8937 | 2.49 | 48000 | 2.6486 | 18.6525 | 6.3426 | 18.5184 | 18.5129 | 22.3337 |
| 2.8446 | 2.54 | 49000 | 2.6572 | 18.6529 | 6.2655 | 18.4915 | 18.4764 | 22.3331 |
| 2.8676 | 2.59 | 50000 | 2.6608 | 19.0913 | 6.494 | 18.929 | 18.9233 | 22.132 |
| 2.8794 | 2.65 | 51000 | 2.6583 | 18.7648 | 6.459 | 18.6276 | 18.6125 | 22.2414 |
| 2.8836 | 2.7 | 52000 | 2.6512 | 18.7243 | 6.3865 | 18.5848 | 18.5763 | 22.2551 |
| 2.8174 | 2.75 | 53000 | 2.6409 | 18.9393 | 6.3914 | 18.7733 | 18.7715 | 22.1243 |
| 2.8494 | 2.8 | 54000 | 2.6396 | 18.6126 | 6.4389 | 18.4673 | 18.4516 | 21.7638 |
| 2.9025 | 2.85 | 55000 | 2.6341 | 18.7681 | 6.3762 | 18.6081 | 18.6173 | 22.1086 |
| 2.8754 | 2.91 | 56000 | 2.6388 | 19.0828 | 6.5203 | 18.9334 | 18.9285 | 22.3497 |
| 2.8489 | 2.96 | 57000 | 2.6375 | 18.9219 | 6.4922 | 18.763 | 18.7437 | 21.9321 |
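
The headline evaluation results at the top of this card (loss 2.6341, Rouge1 18.7681) match the step-55000 row, which has the lowest validation loss in the table, and not the final step-57000 row. That suggests the checkpoint with the best validation loss was kept, although the card does not state this. The sketch below shows that selection on the last eight rows and contrasts it with selecting by Rouge1. The helper names are hypothetical.

```python
# (step, validation_loss, rouge1) copied from the last eight rows of the table.
ROWS = [
    (50000, 2.6608, 19.0913),
    (51000, 2.6583, 18.7648),
    (52000, 2.6512, 18.7243),
    (53000, 2.6409, 18.9393),
    (54000, 2.6396, 18.6126),
    (55000, 2.6341, 18.7681),
    (56000, 2.6388, 19.0828),
    (57000, 2.6375, 18.9219),
]


def best_by_loss(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[1])


def best_by_rouge1(rows):
    """Return the row with the highest Rouge1 score."""
    return max(rows, key=lambda row: row[2])


if __name__ == "__main__":
    print("lowest validation loss:", best_by_loss(ROWS))
    print("highest Rouge1:", best_by_rouge1(ROWS))
```

The two criteria pick different checkpoints here (step 55000 by loss, step 50000 by Rouge1), so the choice of selection metric affects which scores get reported.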

Framework versions

  • Transformers 4.18.0
  • Pytorch 1.10.1+cu113
  • Datasets 2.0.0
  • Tokenizers 0.11.6