---
license: bigscience-openrail-m
language:
- en
inference: false
tags:
- trl
- transformers
- rlhf
datasets:
- lvwerra/stack-exchange-paired
---

![pull_figure](https://huggingface.co/datasets/trl-internal-testing/example-images/resolve/main/images/stack-llama.png)


# Llama-se-rm-peft
Adapter weights of a reward model based on LLaMA (see Meta's LLaMA release for the original LLaMA model).
For more information, see the [blog post](https://huggingface.co/blog/stackllama) and the [GitHub example](https://github.com/lvwerra/trl/tree/main/examples/stack_llama/scripts).



## Model Description
**Llama-se-rm** is a Llama-based model that was first fine-tuned on the Stack Exchange dataset and then trained as a reward model on Stack Exchange data.
This dataset consists of questions and answers from various domains in Stack Exchange, such as programming, mathematics, physics, and more. 
As a reward model, it is designed to score answers to questions in these domains rather than to generate them.
The model has been trained on prompts with the following template:

```
Question: <Query> 

Answer: <Response>
```
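
A minimal sketch of how a question-answer pair might be rendered into this template before tokenization. The helper name `format_qa` is hypothetical, and the exact whitespace (one blank line between the two fields, no trailing spaces) is an assumption based on the template as displayed above.

```python
def format_qa(question: str, answer: str) -> str:
    """Render a question/answer pair in the 'Question: ... Answer: ...' template."""
    # Assumption: the two fields are separated by a single blank line,
    # as in the template shown in this card.
    return f"Question: {question.strip()}\n\nAnswer: {answer.strip()}"


if __name__ == "__main__":
    print(format_qa("How do I reverse a list in Python?", "Use `my_list[::-1]`."))
```

The resulting string is what would be passed to the tokenizer, so that inputs seen at inference time resemble those seen during training.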

## Intended Uses & Limitations
The **Llama-se-rm** model was trained for long-form QA using [Stack Exchange](https://stackexchange.com) data, which is released under a [CC-BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0/) license and covers topics such as programming, mathematics, and physics.
It is intended to demonstrate a Large Language Model's ability to follow a target behavior (in this case, generating answers to a question that would have been rated more highly on SE).
It is not intended to replace human expertise, and answers should be validated through the use of external sources.
Further research is also needed to attribute model generations to sources in the training data, especially in cases where the model may copy answers from the training data *verbatim*.  
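
The target behavior described in this section, preferring answers that would have been rated more highly, is commonly trained with a pairwise ranking objective on paired data. This card does not state the exact loss used, so the following is only a sketch under that assumption. The function name `pairwise_ranking_loss` and the scalar rewards are hypothetical placeholders for the scores a reward model would assign to two answers to the same question.

```python
import math


def pairwise_ranking_loss(reward_preferred: float, reward_other: float) -> float:
    """Return -log(sigmoid(reward_preferred - reward_other)).

    Assumption: this is a generic pairwise preference loss, not necessarily
    the exact objective used to train this model.
    """
    diff = reward_preferred - reward_other
    # -log(sigmoid(diff)) equals log(1 + exp(-diff)); branch on the sign of
    # diff so that exp() is only ever called on a non-positive argument.
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))


if __name__ == "__main__":
    # The loss is small when the preferred answer scores higher, large otherwise.
    print(pairwise_ranking_loss(2.0, 0.0))
    print(pairwise_ranking_loss(0.0, 2.0))
```

The loss depends only on the difference between the two scores, so a model trained this way learns a relative ranking of answers rather than an absolute quality scale.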

## Limitations and Bias
The **Llama-se-rm** model inherits limitations and biases from the Llama model and also those contained in the Stack Exchange dataset.
In particular, per the [latest developer survey for Stack Overflow](https://survey.stackoverflow.co/2022/),
which constitutes a significant part of the StackExchange data,
most users who answered the survey identified themselves as [White or European, men, between 25 and 34 years old, and based in the US (with a significant part of respondents from India).](https://survey.stackoverflow.co/2022/#developer-profile-demographics)
While this demographic information likely varies by topic, disparities between the data contributors and the direct and indirect users of the technology should inform developers in assessing what constitutes an appropriate use case.

Additionally, the model may generate answers that are incorrect or misleading due to the inherent limitations of the Llama architecture.

## BibTeX entry and citation info


```bibtex
@misc {beeching2023stackllama,
	author       = { Edward Beeching and
                     Younes Belkada and
                     Kashif Rasul and
                     Lewis Tunstall and
                     Leandro von Werra and
                     Nazneen Rajani and
                     Nathan Lambert
                   },
	title        = { StackLLaMa: An RL Fine-tuned LLaMa Model for Stack Exchange Question and Answering },
	year         = 2023,
	url          = { https://huggingface.co/trl-lib/llama-7b-se-rm-peft },
	publisher    = { Hugging Face Blog }
}
```