ixa-ehu committed
Commit a68acfa
1 Parent(s): 3baaba5

Update README.md

Files changed (1):
  1. README.md +1 -1
README.md CHANGED
@@ -6,7 +6,7 @@ language:
 
  # LM Loss OPT RM
 
- This is a fine tuned OPT 1.3b model for reward modelling. The finetuning has been done on top of the full [data_name] dataset following the method presented in the paper [Training Language Models with Language Feedback at Scale](arxiv). The main results can be seen in the following table:
+ This is a fine tuned OPT 1.3b model for reward modelling. The finetuning has been done on top of the full [SLF5K](https://huggingface.co/datasets/JeremyAlain/SLF5K) dataset following the method presented in the paper [Training Language Models with Language Feedback at Scale](https://arxiv.org/abs/2303.16755). The main results can be seen in the following table:
 
  | Model | # Params | Validation Accuracy (in %) |
  |--------------------|-----------|-------------------|
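The README describes a reward model, i.e. a model that assigns a scalar score to a candidate text. The sketch below is illustrative only, not part of this commit: the repository id is a hypothetical placeholder, and loading the checkpoint as an OPT-based sequence-classification head with a single output is an assumption about how the reward head might be stored.

```python
# Minimal sketch of querying an OPT-based reward model via transformers.
# Assumptions (not confirmed by this commit): the checkpoint loads as a
# sequence-classification model with one label, and the repo id below is
# a hypothetical placeholder.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "ixa-ehu/lm-loss-opt-rm"  # hypothetical identifier; replace with the actual repo

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=1)
model.eval()

# Score a candidate summary; higher reward means the model prefers it.
summary = "A short candidate summary to be scored."
inputs = tokenizer(summary, return_tensors="pt", truncation=True)

with torch.no_grad():
    reward = model(**inputs).logits.squeeze(-1)  # one scalar per input

print(reward.item())
```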