lomahony/pythia-160m-helpful-dpo

Pythia-160m fine-tuned with the original DPO code on the helpful subset of the Anthropic-hh-rlhf dataset for 1 epoch.

Checkpoints are also uploaded.
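As a minimal sketch, the final model can probably be loaded with the Hugging Face `transformers` library as shown below. The helper names `build_prompt` and `generate` are hypothetical, and the `Human:` / `Assistant:` prompt layout is an assumption based on the dialogue format of the Anthropic-hh-rlhf data, not something this card specifies.

```python
MODEL_ID = "lomahony/pythia-160m-helpful-dpo"


def build_prompt(question: str) -> str:
    # Assumption: mirror the "\n\nHuman: ...\n\nAssistant:" turn layout used
    # by the Anthropic-hh-rlhf dialogues the model was fine-tuned on.
    return f"\n\nHuman: {question}\n\nAssistant:"


def generate(question: str, max_new_tokens: int = 64) -> str:
    # Imported lazily so the module can be imported without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(build_prompt(question), return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=False,  # greedy decoding, for repeatable output
        pad_token_id=tokenizer.eos_token_id,
    )
    # Drop the prompt tokens and decode only the continuation.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate("How do I bake bread?")` downloads the weights on first use. Expect modest output quality from a 160M-parameter model.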

Fully reproducible fine-tuning code is available on GitHub.

See Pythia-160m for model details (paper).

See further details of these models in the paper Attributing Mode Collapse in the Fine-Tuning of Large Language Models.

If these models are helpful, you can cite them as follows:

@inproceedings{o2024attributing,
  title={Attributing Mode Collapse in the Fine-Tuning of Large Language Models},
  author={O'Mahony, Laura and Grinsztajn, Leo and Schoelkopf, Hailey and Biderman, Stella},
  booktitle={ICLR 2024, Mathematical and Empirical Understanding of Foundation Models (ME-FoMo) workshop},
  year={2024}
}

hf (pretrained=lomahony/pythia-160m-helpful-dpo), gen_kwargs: (None), limit: None, num_fewshot: 0, batch_size: 16

Tasks	Version	Filter	Metric	Value		Stderr
arc_challenge	1	none	acc	0.2125	±	0.0120
		none	acc_norm	0.2312	±	0.0123
arc_easy	1	none	acc	0.3965	±	0.0100
		none	acc_norm	0.3830	±	0.0100
boolq	2	none	acc	0.5853	±	0.0086
hellaswag	1	none	acc	0.2811	±	0.0045
		none	acc_norm	0.2940	±	0.0045
lambada_openai	1	none	perplexity	444.4464	±	24.5439
		none	acc	0.1034	±	0.0042
openbookqa	1	none	acc	0.1500	±	0.0160
		none	acc_norm	0.2480	±	0.0193
piqa	1	none	acc	0.5947	±	0.0115
		none	acc_norm	0.5876	±	0.0115
sciq	1	none	acc	0.5880	±	0.0156
		none	acc_norm	0.6180	±	0.0154
wikitext	2	none	word_perplexity	88.8633	±	N/A
		none	byte_perplexity	2.3143	±	N/A
		none	bits_per_byte	1.2106	±	N/A
winogrande	1	none	acc	0.4980	±	0.0141
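Two quick sanity checks can help when reading the table above; the numbers are copied from it. The first confirms that the reported `bits_per_byte` is the base-2 logarithm of `byte_perplexity`. The second turns a `Value ± Stderr` pair into an approximate 95% interval, assuming a normal approximation (the helper `interval95` is hypothetical).

```python
import math

# wikitext: bits_per_byte should equal log2(byte_perplexity).
byte_perplexity = 2.3143
bits_per_byte = math.log2(byte_perplexity)
print(f"bits_per_byte from byte_perplexity: {bits_per_byte:.4f}")


def interval95(value: float, stderr: float) -> tuple[float, float]:
    # Normal-approximation 95% interval: value +/- 1.96 standard errors.
    half_width = 1.96 * stderr
    return value - half_width, value + half_width


# winogrande is a two-way choice, so chance accuracy is 0.5.
low, high = interval95(0.4980, 0.0141)
print(f"winogrande acc 95% interval: [{low:.3f}, {high:.3f}]")
print("chance level inside interval:", low < 0.5 < high)
```

Under this approximation, the zero-shot winogrande accuracy is not distinguishable from chance.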

hf (pretrained=lomahony/pythia-160m-helpful-dpo), gen_kwargs: (None), limit: None, num_fewshot: 5, batch_size: 16

Tasks	Version	Filter	n-shot	Metric	Value		Stderr
arc_challenge	1	none	5	acc	0.1928	±	0.0115
		none	5	acc_norm	0.2398	±	0.0125
arc_easy	1	none	5	acc	0.3678	±	0.0099
		none	5	acc_norm	0.3657	±	0.0099
boolq	2	none	5	acc	0.5841	±	0.0086
hellaswag	1	none	5	acc	0.2807	±	0.0045
		none	5	acc_norm	0.2876	±	0.0045
lambada_openai	1	none	5	perplexity	1607.2529	±	88.3065
		none	5	acc	0.0574	±	0.0032
openbookqa	1	none	5	acc	0.1580	±	0.0163
		none	5	acc_norm	0.2400	±	0.0191
piqa	1	none	5	acc	0.5958	±	0.0114
		none	5	acc_norm	0.5773	±	0.0115
sciq	1	none	5	acc	0.5110	±	0.0158
		none	5	acc_norm	0.5740	±	0.0156
wikitext	2	none	5	word_perplexity	88.8633	±	N/A
		none	5	byte_perplexity	2.3143	±	N/A
		none	5	bits_per_byte	1.2106	±	N/A
winogrande	1	none	5	acc	0.5162	±	0.0140
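The configuration lines above the two tables match the output format of EleutherAI's lm-evaluation-harness. A sketch of how the runs might be repeated from Python follows; it assumes a v0.4-style install exposing `lm_eval.simple_evaluate`. The exact harness version used for this card is not stated, so reproduced numbers may differ slightly. `run_eval` is a hypothetical helper.

```python
MODEL_ARGS = "pretrained=lomahony/pythia-160m-helpful-dpo"

# Task names as listed in the results tables.
TASKS = [
    "arc_challenge", "arc_easy", "boolq", "hellaswag", "lambada_openai",
    "openbookqa", "piqa", "sciq", "wikitext", "winogrande",
]


def run_eval(num_fewshot: int) -> dict:
    # Imported lazily; assumes lm-evaluation-harness (v0.4 or later) is installed.
    import lm_eval

    return lm_eval.simple_evaluate(
        model="hf",
        model_args=MODEL_ARGS,
        tasks=TASKS,
        num_fewshot=num_fewshot,  # 0 for the first table, 5 for the second
        batch_size=16,
    )
```

Calling `run_eval(0)` or `run_eval(5)` returns a dictionary whose `"results"` entry holds the per-task metrics.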
