mlong-t5-tglobal-base

This model is a fine-tuned version of agemagician/mlong-t5-tglobal-base on an unknown dataset. It achieves the following results on the evaluation set:

Loss: 2.1553
Rouge1: 32.0603
Rouge2: 13.4985
Rougel: 24.0775
Rougelsum: 25.9692
Gen Len: 72.828
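
For readers unfamiliar with the metrics above, ROUGE-N can be sketched as an n-gram overlap F-measure, scaled to 0-100 as in this card. This is a simplified illustration, not the scorer that produced the numbers above: it assumes lowercased whitespace tokens and no stemming, and the names `ngrams` and `rouge_n` are hypothetical. The evaluation script's actual tokenization settings are not given in this card.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(reference, candidate, n=1):
    """Simplified ROUGE-N F-measure on a 0-100 scale.

    Assumption: lowercased whitespace tokens, no stemming.
    """
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    overlap = sum((ref & cand).values())  # clipped n-gram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 100 * 2 * precision * recall / (precision + recall)


reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"
print(round(rouge_n(reference, candidate, n=1), 2))  # 83.33
print(round(rouge_n(reference, candidate, n=2), 2))  # 60.0
```

Rouge1 and Rouge2 in the list above correspond to n=1 and n=2; RougeL and RougeLsum are based on longest common subsequences and are not covered by this sketch.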

Model description

More information needed

Intended uses & limitations

More information needed
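
Until the intended uses are documented, the following is a minimal loading sketch under stated assumptions. It assumes the checkpoint is published as `biunlp/mLongT5HeSum-base` (the repository id shown on the model page) and loads through the standard `transformers` sequence-to-sequence classes. The `summarize` helper and its length limits (`max_input_tokens`, `max_new_tokens`) are illustrative choices, not settings documented in this card.

```python
MODEL_ID = "biunlp/mLongT5HeSum-base"  # repository id as shown on the model page


def summarize(text, model_id=MODEL_ID, max_input_tokens=4096, max_new_tokens=128):
    """Generate a summary for `text`.

    Assumption: both length limits are illustrative defaults, not
    values taken from the training configuration.
    """
    # Imported lazily so this module can be read without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Truncate long inputs to the chosen token budget and return PyTorch tensors.
    inputs = tokenizer(
        text,
        truncation=True,
        max_length=max_input_tokens,
        return_tensors="pt",
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize(document_text)` downloads the weights on first use, so it needs network access and enough memory for a base-size model.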

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 1
eval_batch_size: 4
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 30
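
As a rough guide to reproducing this setup, the values above can be mapped onto `Seq2SeqTrainingArguments` keyword names. This is a sketch, not the original training script: it assumes a single device (so the per-device batch sizes equal the listed ones), assumes no warmup because none is listed, and adds `predict_with_generate=True` as a guess at how ROUGE and Gen Len were computed. `HPARAMS`, `linear_lr`, and `build_training_args` are hypothetical names.

```python
# Hyperparameters copied from the list above, keyed by the matching
# Seq2SeqTrainingArguments keyword names.
HPARAMS = {
    "learning_rate": 5e-05,
    "per_device_train_batch_size": 1,  # assumption: single device
    "per_device_eval_batch_size": 4,   # assumption: single device
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 30,
}


def linear_lr(step, total_steps, base_lr=HPARAMS["learning_rate"], warmup_steps=0):
    """Learning rate under a linear schedule: optional linear warmup, then
    linear decay to zero at total_steps. warmup_steps=0 is an assumption."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


def build_training_args(output_dir="mlong-t5-tglobal-base"):
    """Build training arguments; requires transformers and PyTorch."""
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,
        predict_with_generate=True,  # assumption: needed for generation metrics
        **HPARAMS,
    )


print(linear_lr(0, 1000), linear_lr(500, 1000), linear_lr(1000, 1000))
```

With a linear schedule and no warmup, the learning rate starts at 5e-05 and falls in a straight line to zero at the final optimizer step.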

Training results

Training Loss	Epoch	Step	Gen Len	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum
No log	1.0	500	18.987	2.2709	20.5043	8.1518	16.9526	17.5001
2.8714	2.0	1000	18.982	2.2022	21.4051	8.7445	17.7534	18.3191
2.8714	3.0	1500	18.99	2.1608	21.6609	9.1753	18.0374	18.6176
2.5137	4.0	2000	18.993	2.1555	21.6818	9.1814	18.0382	18.6198
2.5137	5.0	2500	18.994	2.1462	21.9708	9.2033	18.3919	18.9535
2.3717	6.0	3000	18.996	2.1258	22.0583	9.2987	18.4379	19.0322
2.3717	7.0	3500	18.989	2.1278	21.8245	9.0474	18.1979	18.8038
2.2633	8.0	4000	18.996	2.1207	21.6273	8.8847	18.024	18.6049
2.2633	9.0	4500	18.994	2.1180	22.2004	9.6253	18.6373	19.1721
2.1886	10.0	5000	18.988	2.1220	22.1619	9.6206	18.5069	19.0856
2.1886	11.0	5500	18.987	2.1161	22.1518	9.4522	18.4695	19.0552
2.1144	12.0	6000	18.995	2.1103	22.0395	9.4185	18.4314	19.0305
2.1144	13.0	6500	18.992	2.1150	22.2404	9.4722	18.5482	19.1747
2.054	14.0	7000	19.0	2.1091	22.1466	9.3434	18.3443	18.9233
2.0526	1.0	8000	62.488	2.1580	30.4149	12.0774	22.9493	24.4478
2.1236	2.0	16000	64.797	2.1621	31.3101	13.3237	23.8249	25.526
2.0776	3.0	24000	57.059	2.1607	30.9902	12.3753	23.0243	24.8308
1.9843	4.0	32000	72.828	2.1553	32.0603	13.4985	24.0775	25.9692
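
To compare checkpoints programmatically, the rows can be parsed as tab-separated values. The sketch below uses only the last four rows of the table above as sample data; `load_rows` and `best_row` are hypothetical helper names.

```python
import csv
import io

# Header plus the last four rows of the training results table.
TABLE = (
    "Training Loss\tEpoch\tStep\tGen Len\tValidation Loss\tRouge1\tRouge2\tRougel\tRougelsum\n"
    "2.0526\t1.0\t8000\t62.488\t2.1580\t30.4149\t12.0774\t22.9493\t24.4478\n"
    "2.1236\t2.0\t16000\t64.797\t2.1621\t31.3101\t13.3237\t23.8249\t25.526\n"
    "2.0776\t3.0\t24000\t57.059\t2.1607\t30.9902\t12.3753\t23.0243\t24.8308\n"
    "1.9843\t4.0\t32000\t72.828\t2.1553\t32.0603\t13.4985\t24.0775\t25.9692\n"
)


def load_rows(text):
    """Parse the tab-separated table into a list of dicts keyed by column name."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [dict(row) for row in reader]


def best_row(rows, metric, higher_is_better=True):
    """Return the row with the best value for the given metric column."""
    def key(row):
        return float(row[metric])

    return max(rows, key=key) if higher_is_better else min(rows, key=key)


rows = load_rows(TABLE)
print(best_row(rows, "Rouge1")["Step"])  # 32000
print(best_row(rows, "Validation Loss", higher_is_better=False)["Step"])  # 32000
```

Among these four rows, step 32000 has both the highest Rouge1 and the lowest validation loss, which matches the evaluation results reported at the top of this card.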

Framework versions

Transformers 4.38.2
PyTorch 1.13.1+cu117
Datasets 2.18.0
Tokenizers 0.15.2
