tlc4418/pythia_70m_sft
---
datasets:
- tatsu-lab/alpaca_farm
---
Pythia 70M model after supervised fine-tuning (SFT) on the 'sft' split of the AlpacaFarm dataset.

This model was used as the base for the reward models in 'Reward Model Ensembles Help Mitigate Overoptimization'.
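
A minimal usage sketch, not part of the original card: it assumes the repository ships both weights and tokenizer files loadable through the standard `transformers` auto classes, and that the model expects the Alpaca-style no-input prompt template. Neither assumption is confirmed here. The helper names `build_prompt` and `generate` are hypothetical.

```python
MODEL_ID = "tlc4418/pythia_70m_sft"


def build_prompt(instruction: str) -> str:
    """Wrap an instruction in the Alpaca-style no-input template.

    Assumption: the SFT run used this template; the card does not say so.
    """
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n### Response:\n"
    )


def generate(instruction: str, max_new_tokens: int = 64) -> str:
    """Greedy-decode a response; downloads the model on first call."""
    # Imported lazily so build_prompt can be used without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: tokenizer files are present in the model repository.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(build_prompt(instruction), return_tensors="pt")
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, do_sample=False
    )
    # Drop the prompt tokens and decode only the newly generated ones.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


# Example call (requires network access and the transformers + torch packages):
# print(generate("Name three primary colors."))
```

If the tokenizer fails to load from this repository, loading it from the base Pythia 70M checkpoint may work instead, though that is also untested here. Since the card describes the model as a base for reward models, text generation is likely only a sanity check rather than the main intended use.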