---
license: gpl-3.0
---

This model demonstrates that GPT-J can work perfectly well as an "instruct" model when properly fine-tuned.

We fine-tuned GPT-J on an instruction dataset created by the [Stanford Alpaca team](https://github.com/tatsu-lab/stanford_alpaca). You can find the original dataset [here](https://github.com/tatsu-lab/stanford_alpaca/blob/main/alpaca_data.json).

The dataset was slightly reworked to match the GPT-J fine-tuning format used by [Mesh Transformer Jax](https://github.com/kingoflolz/mesh-transformer-jax) on TPUs. [Here is the final dataset we used](https://huggingface.co/datasets/nlpcloud/instructions-dataset-adapted-from-stanford-alpaca-for-gpt-j).
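
If you want to see what the instruction data looks like, here is a minimal sketch (not part of this repository) that downloads and inspects the original Stanford Alpaca file; the raw URL is derived from the GitHub link above, and the `instruction`/`input`/`output` fields follow the Alpaca format:

```python
# Minimal sketch: download and inspect the original Stanford Alpaca data.
# The raw URL below is derived from the GitHub link above; adjust it if the file moves.
import json
import urllib.request

ALPACA_URL = "https://raw.githubusercontent.com/tatsu-lab/stanford_alpaca/main/alpaca_data.json"

with urllib.request.urlopen(ALPACA_URL) as response:
    records = json.load(response)

# Each record is a dict with "instruction", "input" (possibly empty) and "output" fields.
print(len(records))
print(json.dumps(records[0], indent=2))
```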

The base GPT-J model needs few-shot learning in order to properly understand what you want. [See more details here about how to properly use few-shot learning](https://nlpcloud.com/effectively-using-gpt-j-gpt-neo-gpt-3-alternatives-few-shot-learning.html). For example, let's say that you want to correct spelling with GPT-J. Here is an example of a prompt you would have to use:

```text
I love goin to the beach.
Correction: I love going to the beach.
###
Let me hav it!
Correction: Let me have it!
###
It have too many drawbacks.
Correction: It has too many drawbacks.
###
I do not wan to go
Correction:
```
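
For illustration, here is a rough sketch of how such a few-shot prompt could be sent to the base GPT-J checkpoint with the `transformers` library; the `EleutherAI/gpt-j-6B` identifier and the generation settings are just one reasonable choice, not something this repository ships:

```python
# Rough sketch: few-shot spelling correction with the base GPT-J model.
# Loading GPT-J in float32 needs roughly 24 GB of memory; adapt this to your hardware.
from transformers import pipeline

generator = pipeline("text-generation", model="EleutherAI/gpt-j-6B")

few_shot_prompt = (
    "I love goin to the beach.\n"
    "Correction: I love going to the beach.\n"
    "###\n"
    "Let me hav it!\n"
    "Correction: Let me have it!\n"
    "###\n"
    "It have too many drawbacks.\n"
    "Correction: It has too many drawbacks.\n"
    "###\n"
    "I do not wan to go\n"
    "Correction:"
)

output = generator(few_shot_prompt, max_new_tokens=20, return_full_text=False)
# The model keeps generating further examples after the answer, so cut at the "###" separator.
print(output[0]["generated_text"].split("###")[0].strip())
```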

Now, with Instruct GPT-J, here is what you can do:

```text
Correct spelling and grammar from the following text.
I do not wan to go
```
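
In code, this boils down to sending the instruction as a plain prompt. Here is a hedged sketch with `transformers`; the model identifier below is only a placeholder and should be replaced with this repository's actual id:

```python
# Rough sketch: sending a plain instruction to the fine-tuned "instruct" model.
# "<this-repository-id>" is a placeholder; replace it with the actual model id.
from transformers import pipeline

generator = pipeline("text-generation", model="<this-repository-id>")

instruction = (
    "Correct spelling and grammar from the following text.\n"
    "I do not wan to go\n"
)

output = generator(instruction, max_new_tokens=30, return_full_text=False)
print(output[0]["generated_text"].strip())
```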