Update README.md
README.md
# LM Loss OPT RM
This is a fine-tuned OPT-1.3b model for reward modelling. The fine-tuning was done on the full [SLF5K](https://huggingface.co/datasets/JeremyAlain/SLF5K) dataset, following the method presented in the paper [Training Language Models with Language Feedback at Scale](https://arxiv.org/abs/2303.16755). The main results are shown in the following table:
| Model | # Params | Validation Accuracy (in %) |
|--------------------|-----------|-------------------|
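
Below is a minimal usage sketch for scoring a candidate summary with this reward model. The repo id `your-org/lm-loss-opt-rm`, the `post`/`summary` prompt format, and loading the scalar reward head via `AutoModelForSequenceClassification` are illustrative assumptions, not details taken from this card.

```python
# Hypothetical usage sketch: score a candidate summary with the reward model.
# The repo id and prompt format are placeholders, and loading the checkpoint
# via AutoModelForSequenceClassification (one scalar head) is an assumption.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "your-org/lm-loss-opt-rm"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=1)
model.eval()

post = "A Reddit post to be summarized ..."
summary = "A candidate summary of the post."

# Assumed prompt format: the post followed by the candidate summary.
text = f"{post}\n\nTL;DR: {summary}"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=1024)

with torch.no_grad():
    # A single-label classification head yields one logit, read as the reward.
    reward = model(**inputs).logits.squeeze().item()

print(f"Reward score: {reward:.4f}")
```

Higher scores would indicate summaries the model prefers; comparing the scores of two candidate summaries for the same post mirrors how validation accuracy is computed for reward models trained on preference data.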