text_shortening_model_v56

This model, published as ldos/text_shortening_model_v56, is a fine-tuned version of t5-small. The training dataset is not recorded (the card lists it as None). It achieves the following results on the evaluation set:

Loss: 2.2446
Rouge1: 0.3315
Rouge2: 0.1705
Rougel: 0.302
Rougelsum: 0.302
Bert precision: 0.8254
Bert recall: 0.8322
Average word count: 7.3374
Max word count: 18
Min word count: 2
Average token count: 11.3745
% shortened texts with length > 12: 4.7763
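
The card does not define how the length metrics above are computed. As a minimal sketch, assuming words are whitespace-separated tokens and that the "length > 12" threshold is measured in words, statistics of this shape could be produced as follows. The function name `length_stats` and its dictionary keys are hypothetical, not taken from the training code.

```python
def length_stats(texts, threshold=12):
    """Word-count statistics over a list of generated texts.

    Assumption: a "word" is a whitespace-separated token, and the
    threshold applies to the word count (the card does not say).
    """
    if not texts:
        raise ValueError("texts must not be empty")
    counts = [len(t.split()) for t in texts]
    over = sum(1 for c in counts if c > threshold)
    return {
        "average_word_count": sum(counts) / len(counts),
        "max_word_count": max(counts),
        "min_word_count": min(counts),
        "pct_longer_than_threshold": 100.0 * over / len(counts),
    }
```

Under this definition an empty output string has a word count of 0.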

Model description

More information needed

Intended uses & limitations

More information needed
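
The card does not document usage. The following is a minimal sketch, assuming the checkpoint loads with the standard Transformers seq2seq classes (as t5-small does) and that no task prefix is required on the input. The helper names `load_shortener` and `shorten`, and the `max_new_tokens=24` default, are illustrative choices rather than documented settings.

```python
MODEL_ID = "ldos/text_shortening_model_v56"  # repository ID of this model


def load_shortener(model_id=MODEL_ID):
    """Load the tokenizer and model (downloads weights on first use)."""
    # Imported lazily so the module can be inspected without Transformers.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    return tokenizer, model


def shorten(text, tokenizer, model, max_new_tokens=24):
    """Generate a shortened version of `text`.

    Assumption: the raw text is passed without a task prefix; the card
    does not say whether training used one.
    """
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    tok, mdl = load_shortener()
    print(shorten("An example sentence that is longer than it needs to be.", tok, mdl))
```

Output quality should be checked against the evaluation metrics above before relying on the model, since the intended use and limitations are not documented.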

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 20
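
As a rough sketch, the values listed above could be passed to `Seq2SeqTrainingArguments` from Transformers as shown here. The mapping of card fields to keyword names is an assumption (for example, that `train_batch_size` is the per-device batch size), and `output_dir` is a hypothetical path. Settings the card does not list, such as warmup or weight decay, are left at library defaults.

```python
# Values copied from the hyperparameter list; the keyword names are the
# Seq2SeqTrainingArguments fields assumed to correspond to them.
TRAINING_KWARGS = {
    "learning_rate": 1e-05,
    "per_device_train_batch_size": 16,  # card: train_batch_size
    "per_device_eval_batch_size": 16,   # card: eval_batch_size
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 20,             # card: num_epochs
}


def build_training_args(output_dir="text_shortening_model_v56"):
    """Build the arguments object; `output_dir` is a hypothetical path."""
    # Imported lazily so the settings can be read without Transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)
```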

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Bert precision	Bert recall	Average word count	Max word count	Min word count	Average token count	% shortened texts with length > 12
3.2947	1.0	288	2.7198	0.2581	0.1248	0.2329	0.2328	0.7592	0.7746	8.0751	18	0	13.4678	12.5095
2.8745	2.0	576	2.5497	0.2967	0.148	0.2692	0.269	0.8107	0.8193	7.7149	18	0	11.8552	8.3397
2.7549	3.0	864	2.4721	0.31	0.1548	0.2806	0.2805	0.8158	0.8247	7.7263	18	0	11.7786	6.975
2.6785	4.0	1152	2.4212	0.3135	0.1582	0.2834	0.2837	0.8185	0.8264	7.5815	18	0	11.6005	6.3685
2.6289	5.0	1440	2.3872	0.3188	0.1622	0.2879	0.2882	0.8196	0.8278	7.602	18	0	11.6497	6.5959
2.587	6.0	1728	2.3611	0.3224	0.1633	0.2909	0.2911	0.8202	0.8291	7.6232	18	0	11.6694	6.5959
2.5615	7.0	2016	2.3401	0.3284	0.168	0.297	0.2972	0.8222	0.8303	7.4936	18	0	11.5299	5.8378
2.5354	8.0	2304	2.3223	0.3299	0.1703	0.299	0.299	0.8228	0.831	7.5171	18	0	11.5519	5.9136
2.5074	9.0	2592	2.3069	0.3314	0.1702	0.2999	0.3	0.8237	0.832	7.5383	18	2	11.5595	5.8378
2.4868	10.0	2880	2.2944	0.3317	0.1713	0.3014	0.3013	0.8246	0.8317	7.4193	18	2	11.4519	5.5345
2.4773	11.0	3168	2.2830	0.3322	0.1705	0.3013	0.3013	0.8247	0.8319	7.3904	18	2	11.4238	5.0038
2.4571	12.0	3456	2.2738	0.3288	0.1685	0.2987	0.2987	0.8242	0.831	7.3343	18	2	11.3715	4.5489
2.4494	13.0	3744	2.2672	0.3322	0.1705	0.3013	0.3014	0.8251	0.8319	7.3351	18	2	11.3798	4.5489
2.4401	14.0	4032	2.2611	0.33	0.1692	0.3004	0.3005	0.8246	0.8315	7.3639	18	2	11.4139	4.8522
2.431	15.0	4320	2.2564	0.3303	0.1698	0.3004	0.3004	0.8248	0.8317	7.3745	18	2	11.4238	5.0796
2.4253	16.0	4608	2.2522	0.3308	0.1704	0.3016	0.3014	0.8252	0.8319	7.3328	18	2	11.3791	4.8522
2.4111	17.0	4896	2.2490	0.3313	0.1705	0.3017	0.3017	0.8254	0.8319	7.3222	18	2	11.3563	4.8522
2.4125	18.0	5184	2.2464	0.3313	0.1702	0.3017	0.3017	0.8254	0.8321	7.3328	18	2	11.3654	4.8522
2.4061	19.0	5472	2.2450	0.3313	0.1701	0.3017	0.3018	0.8254	0.8321	7.3359	18	2	11.3723	4.7763
2.4129	20.0	5760	2.2446	0.3315	0.1705	0.302	0.302	0.8254	0.8322	7.3374	18	2	11.3745	4.7763
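
The Step column allows a small worked calculation. The sketch below assumes a single device, no gradient accumulation, a kept partial last batch, and zero warmup steps; none of these are stated on the card. Under those assumptions it derives the total step count, a range for the training-set size, and the learning rate a linear schedule would give at a given step. The function names are hypothetical.

```python
STEPS_PER_EPOCH = 288   # Step value at epoch 1.0 in the table
NUM_EPOCHS = 20
TRAIN_BATCH_SIZE = 16
BASE_LR = 1e-05
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 5760, the Step value at epoch 20.0


def train_set_size_bounds(steps_per_epoch, batch_size):
    """Smallest and largest dataset sizes that yield this many batches,
    assuming the final partial batch is kept."""
    smallest = (steps_per_epoch - 1) * batch_size + 1
    largest = steps_per_epoch * batch_size
    return smallest, largest


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate under linear warmup followed by linear decay to zero.

    warmup_steps=0 is an assumption; the card lists no warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)
```

For example, `train_set_size_bounds(STEPS_PER_EPOCH, TRAIN_BATCH_SIZE)` gives a range of 4593 to 4608 training examples, and `linear_lr(2880)` gives half the base rate at the end of epoch 10.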

Framework versions

Transformers 4.33.1
PyTorch 2.0.1+cu118
Datasets 2.14.5
Tokenizers 0.13.3
