Update README.md
README.md
CHANGED
@@ -8,6 +8,9 @@ tags:
 - reinforcement-learning
 ---
 
+![pull_figure](https://huggingface.co/datasets/trl-internal-testing/example-images/resolve/main/images/stack-llama.png)
+
+
 # Llama-se-rm-peft
 Adapter weights of a reward model based on LLaMa. Authored by Edward Beeching, Younes Belkada, Kashif Rasul, Lewis Tunstall and Leandro von Werra.
 For more info check out the [blog post]() and [github example]().