
long_t5_3

This model is a fine-tuned version of google/long-t5-tglobal-base on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 2.1612
  • Rouge1: 0.5309
  • Rouge2: 0.3406
  • Rougel: 0.4779
  • Rougelsum: 0.4778
  • Gen Len: 30.6175
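
A minimal inference sketch (the card itself does not include one). The repository id zera09/long_t5_3 is taken from this page; the summarization-style usage is an assumption, since the training dataset is listed as unknown:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "zera09/long_t5_3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "Replace with a long input document."  # placeholder input
inputs = tokenizer(text, return_tensors="pt", truncation=True)

# The average generation length on the eval set was ~30 tokens,
# so a modest generation budget is sufficient.
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```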

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 30
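
For reference, a sketch of how these values map onto Seq2SeqTrainingArguments; the output_dir value and the predict_with_generate flag are assumptions, not taken from this card:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="long_t5_3",      # placeholder path
    learning_rate=1e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=30,
    predict_with_generate=True,  # assumed, since ROUGE and Gen Len are reported
    # Adam with betas=(0.9, 0.999) and epsilon=1e-8 matches the default
    # optimizer settings in transformers, so no explicit override is needed.
)
```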

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
| 2.0161 | 1.0 | 1000 | 1.5665 | 0.4911 | 0.3059 | 0.4451 | 0.4451 | 25.5255 |
| 1.7658 | 2.0 | 2000 | 1.5150 | 0.5026 | 0.3142 | 0.4559 | 0.4557 | 26.8015 |
| 1.5969 | 3.0 | 3000 | 1.5031 | 0.51 | 0.3238 | 0.4628 | 0.4626 | 26.0075 |
| 1.4638 | 4.0 | 4000 | 1.5048 | 0.5189 | 0.3348 | 0.4724 | 0.4724 | 26.878 |
| 1.3675 | 5.0 | 5000 | 1.5363 | 0.5233 | 0.3369 | 0.4769 | 0.477 | 27.204 |
| 1.249 | 6.0 | 6000 | 1.5550 | 0.5206 | 0.3376 | 0.4762 | 0.4759 | 25.569 |
| 1.1861 | 7.0 | 7000 | 1.5511 | 0.5283 | 0.3444 | 0.4825 | 0.4824 | 26.8355 |
| 1.0985 | 8.0 | 8000 | 1.5838 | 0.5284 | 0.342 | 0.4792 | 0.4792 | 28.631 |
| 1.0178 | 9.0 | 9000 | 1.6231 | 0.5331 | 0.3451 | 0.4827 | 0.4828 | 28.7125 |
| 0.9649 | 10.0 | 10000 | 1.6392 | 0.5262 | 0.3384 | 0.4762 | 0.4762 | 29.0855 |
| 0.9069 | 11.0 | 11000 | 1.6758 | 0.5307 | 0.3421 | 0.4808 | 0.4804 | 28.9355 |
| 0.8472 | 12.0 | 12000 | 1.7137 | 0.5304 | 0.3458 | 0.481 | 0.4809 | 29.29 |
| 0.8087 | 13.0 | 13000 | 1.7478 | 0.5287 | 0.342 | 0.4789 | 0.4786 | 29.5185 |
| 0.773 | 14.0 | 14000 | 1.7628 | 0.5302 | 0.3436 | 0.4801 | 0.4801 | 29.725 |
| 0.7271 | 15.0 | 15000 | 1.8112 | 0.5293 | 0.3418 | 0.4789 | 0.4786 | 30.188 |
| 0.6919 | 16.0 | 16000 | 1.8520 | 0.5293 | 0.342 | 0.4778 | 0.4778 | 30.4125 |
| 0.665 | 17.0 | 17000 | 1.8738 | 0.5341 | 0.3432 | 0.4821 | 0.482 | 29.534 |
| 0.6242 | 18.0 | 18000 | 1.9228 | 0.5314 | 0.3439 | 0.4793 | 0.4792 | 29.2675 |
| 0.6024 | 19.0 | 19000 | 1.9288 | 0.535 | 0.347 | 0.4824 | 0.4823 | 29.852 |
| 0.5791 | 20.0 | 20000 | 1.9614 | 0.531 | 0.3417 | 0.4793 | 0.4791 | 29.754 |
| 0.5445 | 21.0 | 21000 | 2.0021 | 0.5302 | 0.3411 | 0.4784 | 0.4783 | 31.0095 |
| 0.5355 | 22.0 | 22000 | 2.0283 | 0.5318 | 0.3432 | 0.4792 | 0.4794 | 30.2985 |
| 0.5172 | 23.0 | 23000 | 2.0588 | 0.5296 | 0.3413 | 0.4775 | 0.4774 | 30.463 |
| 0.4968 | 24.0 | 24000 | 2.0907 | 0.5311 | 0.3423 | 0.4781 | 0.478 | 31.0295 |
| 0.4821 | 25.0 | 25000 | 2.0964 | 0.5318 | 0.3428 | 0.4792 | 0.4793 | 30.8365 |
| 0.4727 | 26.0 | 26000 | 2.1195 | 0.5317 | 0.3424 | 0.4789 | 0.4788 | 30.391 |
| 0.458 | 27.0 | 27000 | 2.1357 | 0.5301 | 0.3391 | 0.4761 | 0.4761 | 30.9145 |
| 0.4454 | 28.0 | 28000 | 2.1648 | 0.531 | 0.3409 | 0.4774 | 0.4774 | 31.1835 |
| 0.444 | 29.0 | 29000 | 2.1570 | 0.532 | 0.3418 | 0.4792 | 0.4791 | 30.596 |
| 0.4349 | 30.0 | 30000 | 2.1612 | 0.5309 | 0.3406 | 0.4779 | 0.4778 | 30.6175 |
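
The Rouge columns are on a 0-1 scale. A minimal sketch of computing comparable scores with the evaluate library (an assumption about the original setup; the predictions and references below are placeholders):

```python
import evaluate  # also requires the rouge_score package

rouge = evaluate.load("rouge")

# Placeholder data; the actual evaluation dataset is unknown.
predictions = ["a generated summary of the document"]
references = ["the reference summary of the document"]

scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # rouge1, rouge2, rougeL, rougeLsum
```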

Framework versions

  • Transformers 4.45.1
  • Pytorch 2.2.1
  • Datasets 3.0.1
  • Tokenizers 0.20.0
The weights are published in Safetensors format: 297M parameters stored as F32 tensors.
