german-jeopardy-longt5-large-256

This model is a fine-tuned version of google/long-t5-tglobal-large on the lmqg/qg_dequad dataset. It achieves the following results on the evaluation set:

Loss: 2.8541
Brevity Penalty: 0.8795
System Length: 18427
Reference Length: 20793
ROUGE-1: 23.88
ROUGE-2: 8.54
ROUGE-L: 23.14
ROUGE-Lsum: 23.13
Exact Match: 0.32
BLEU: 4.87
F1: 23.82
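
The brevity penalty above can be checked against the two reported lengths. The sketch below assumes the standard BLEU definition, where the penalty is exp(1 - reference/system) whenever the system output is shorter than the reference; the function name `brevity_penalty` is illustrative and not part of any library.

```python
import math


def brevity_penalty(system_length: int, reference_length: int) -> float:
    """Standard BLEU brevity penalty (assumed definition, see lead-in)."""
    if system_length == 0:
        return 0.0
    if system_length >= reference_length:
        return 1.0
    # Outputs shorter than the references are penalised exponentially.
    return math.exp(1.0 - reference_length / system_length)


# Lengths taken from the evaluation summary above.
bp = brevity_penalty(system_length=18427, reference_length=20793)
print(round(bp, 4))
```

The result agrees with the reported value of 0.8795 to four decimal places, which suggests the generated questions are on average somewhat shorter than the references.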

Model description

See google/long-t5-tglobal-large for more information about the model architecture.
The model was trained on a single NVIDIA RTX 3090 GPU with 24 GB of VRAM.

Intended uses & limitations

This model can be used for question generation on German text.
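
A minimal inference sketch with the Transformers library might look as follows. Several things here are assumptions rather than documented behaviour: that the model is published on the Hub as `GiantTreeG/german-jeopardy-longt5-large-256`, that a plain German passage is an acceptable input (the card does not specify the input format, and other lmqg-style models expect special highlight tokens), and that the generation settings (`num_beams=4`, `max_new_tokens=64`) are reasonable defaults. The helper name `generate_question` is hypothetical.

```python
def generate_question(
    context: str,
    model_id: str = "GiantTreeG/german-jeopardy-longt5-large-256",  # assumed Hub ID
    max_new_tokens: int = 64,  # assumed limit, not taken from the card
) -> str:
    """Generate one question for a German passage (hedged sketch)."""
    # Imported lazily so that defining the helper does not require transformers.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Assumption: the raw passage is passed as-is, without any prefix or markup.
    inputs = tokenizer(context, return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (downloads the model weights on first use):
# print(generate_question("Die Zugspitze ist mit 2962 Metern der höchste Berg Deutschlands."))
```

Given the reported BLEU and Exact Match scores, generated questions should be reviewed before use rather than taken as reliable.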

Training and evaluation data

See lmqg/qg_dequad.

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 2
eval_batch_size: 2
seed: 7
gradient_accumulation_steps: 128
total_train_batch_size: 256
optimizer: Adafactor
lr_scheduler_type: constant
num_epochs: 20
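
The relationship between these values can be made explicit with a little arithmetic. The sketch below assumes a single device, as stated in the model description, and uses the final step count (720) from the training results table; the dictionary keys simply mirror the list above.

```python
# Hyperparameters copied from the list above.
hparams = {
    "learning_rate": 1e-4,
    "train_batch_size": 2,
    "eval_batch_size": 2,
    "seed": 7,
    "gradient_accumulation_steps": 128,
    "optimizer": "Adafactor",
    "lr_scheduler_type": "constant",
    "num_epochs": 20,
}

num_devices = 1  # single RTX 3090, per the model description

# Gradients from 128 micro-batches of 2 examples are accumulated
# before each optimizer step.
effective_batch_size = (
    hparams["train_batch_size"]
    * hparams["gradient_accumulation_steps"]
    * num_devices
)

final_step = 720  # last optimizer step in the training results table
samples_seen = final_step * effective_batch_size

print(effective_batch_size)  # 256, matching total_train_batch_size
print(samples_seen)          # 184320 training examples processed in total
```

Gradient accumulation trades wall-clock time for memory: only two examples need to fit on the GPU at once, while each update still averages over 256 examples.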

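Training results

In the table below, ROUGE, Exact Match, and F1 are on a 0-1 scale, whereas the summary at the top of this card reports them as percentages.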

Training Loss	Epoch	Step	Validation Loss	Counts 1	Counts 2	Counts 3	Counts 4	Totals 1	Totals 2	Totals 3	Totals 4	Precisions 1	Precisions 2	Precisions 3	Precisions 4	Brevity Penalty	System Length	Reference Length	ROUGE-1	ROUGE-2	ROUGE-L	ROUGE-Lsum	Exact Match	BLEU	Mean Generated Length	F1
8.8727	0.99	36	6.3810	2198	0	0	0	2204	0	0	0	99.7278	0.0	0.0	0.0	0.0002	2204	21250	0.0	0.0	0.0	0.0	0.0	0.0	2.0	0.0
6.0165	1.98	72	5.3864	3587	137	0	0	21960	19756	17552	15348	16.3342	0.6935	0.0028	0.0016	1.0	21960	21250	0.0702	0.0079	0.07	0.07	0.0	0.0851	15.0091	0.073
5.1537	3.0	109	4.9617	3601	145	1	0	14449	12245	10041	7837	24.9221	1.1842	0.01	0.0064	0.6246	14449	21250	0.0882	0.0107	0.0877	0.0876	0.0	0.13	9.5309	0.0926
4.863	3.99	145	4.5531	4590	229	19	0	41674	39470	37266	35062	11.0141	0.5802	0.051	0.0014	1.0	41674	21250	0.0811	0.0081	0.0768	0.0767	0.0	0.1468	29.4528	0.0836
4.5201	4.97	181	4.2020	3643	169	19	0	16104	13900	11696	9492	22.6217	1.2158	0.1624	0.0053	0.7265	16104	21250	0.0865	0.0115	0.0856	0.0855	0.0	0.2845	12.5077	0.0907
4.1347	5.99	218	3.9353	3670	167	20	0	16796	14592	12388	10184	21.8504	1.1445	0.1614	0.0049	0.7671	16796	21250	0.087	0.0114	0.0859	0.0858	0.0	0.2878	13.1656	0.0917
4.012	6.98	254	3.7593	3780	198	35	1	16582	14378	12174	9970	22.7958	1.3771	0.2875	0.01	0.7546	16582	21250	0.0916	0.0128	0.0903	0.0902	0.0	0.4139	12.2931	0.0968
3.7048	8.0	291	3.6034	3668	205	36	3	16158	13954	11750	9546	22.7008	1.4691	0.3064	0.0314	0.7297	16158	21250	0.0882	0.0134	0.0873	0.0872	0.0	0.5493	11.7568	0.0923
3.6284	8.99	327	3.4567	4070	527	160	28	17459	15255	13051	10847	23.3118	3.4546	1.226	0.2581	0.8048	17459	21250	0.1109	0.0281	0.1083	0.1082	0.0	1.8083	9.7777	0.1152
3.4605	9.98	363	3.3390	4325	512	128	27	18829	16625	14421	12217	22.9699	3.0797	0.8876	0.221	0.8793	18829	21250	0.1206	0.0288	0.1168	0.1167	0.0	1.6972	12.6729	0.1254
3.2267	10.99	400	3.1995	4498	774	237	49	18802	16598	14394	12190	23.923	4.6632	1.6465	0.402	0.8779	18802	21250	0.1348	0.0405	0.132	0.1319	0.0005	2.5735	11.5009	0.1381
3.1761	11.98	436	3.1165	4578	866	260	50	16963	14759	12555	10351	26.9882	5.8676	2.0709	0.483	0.7767	16963	21250	0.1454	0.0464	0.1426	0.1427	0.0005	2.7554	10.5172	0.1492
3.0323	12.97	472	3.0074	5019	1048	319	59	18077	15873	13669	11465	27.7646	6.6024	2.3337	0.5146	0.839	18077	21250	0.1691	0.0557	0.1648	0.1647	0.0009	3.2318	12.8294	0.1729
2.8223	13.99	509	2.8911	5257	1120	341	85	17074	14870	12666	10462	30.7895	7.5319	2.6922	0.8125	0.783	17074	21250	0.189	0.0635	0.1841	0.184	0.0018	3.7161	12.6824	0.1929
2.7732	14.98	545	2.8103	5616	1271	407	113	17784	15580	13376	11172	31.5789	8.1579	3.0428	1.0115	0.8229	17784	21250	0.2122	0.0731	0.2063	0.2061	0.0045	4.3667	13.0944	0.217
2.58	16.0	582	2.7183	5959	1461	510	171	18808	16604	14400	12196	31.6833	8.7991	3.5417	1.4021	0.8782	18808	21250	0.2286	0.0822	0.2214	0.2212	0.0064	5.357	13.9174	0.2316
2.5368	16.99	618	2.6630	5935	1543	576	201	16923	14719	12515	10311	35.0706	10.483	4.6025	1.9494	0.7744	16923	21250	0.2365	0.089	0.2309	0.2307	0.0059	5.8686	12.3185	0.2377
2.4325	17.98	654	2.5798	6305	1756	685	265	17870	15666	13462	11258	35.2826	11.209	5.0884	2.3539	0.8277	17870	21250	0.2518	0.0982	0.2452	0.2452	0.0059	6.8664	13.1688	0.2537
2.2632	18.99	691	2.5155	6577	1888	762	304	17785	15581	13377	11173	36.9806	12.1173	5.6963	2.7208	0.823	17785	21250	0.2689	0.1102	0.261	0.2611	0.0086	7.5129	13.2373	0.2702
2.2026	19.79	720	2.4997	6644	1853	720	273	17658	15454	13250	11046	37.626	11.9904	5.434	2.4715	0.8159	17658	21250	0.2717	0.1097	0.2628	0.2625	0.0073	7.1987	13.6343	0.2742
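
The BLEU column can be reproduced from the other columns of the same row. The sketch below assumes the usual corpus-level definition: the brevity penalty multiplied by the geometric mean of the four n-gram precisions (given in percent). It does not model any smoothing of zero counts, so it returns 0 whenever a precision is 0. The function name `bleu_from_row` is illustrative.

```python
import math


def bleu_from_row(precisions, brevity_penalty):
    """Recompute BLEU from a table row: BP * geometric mean of the precisions."""
    if min(precisions) <= 0.0:
        # Without smoothing, one zero precision makes the geometric mean zero.
        return 0.0
    log_mean = sum(math.log(p) for p in precisions) / len(precisions)
    return brevity_penalty * math.exp(log_mean)


# Values from the final row (step 720) of the table above.
last_row_bleu = bleu_from_row(
    precisions=[37.626, 11.9904, 5.434, 2.4715],
    brevity_penalty=0.8159,
)
print(round(last_row_bleu, 2))
```

The recomputed value is approximately 7.20, in line with the 7.1987 listed for step 720; the small difference comes from the rounded inputs.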

Framework versions

Transformers 4.32.1
Pytorch 2.1.0
Datasets 2.12.0
Tokenizers 0.13.3

GiantTreeG/german-jeopardy-longt5-large-256