Add all dataset types
README.md
CHANGED
@@ -33,13 +33,32 @@ Go ahead and axolotl questions!!
|
|
33 |
|
34 |
### Dataset
|
35 |
|
36 |
-
Have a dataset in one of the following format:
|
37 |
|
38 |
-
- alpaca: instruction
|
39 |
```json
|
40 |
{"instruction": "...", "input": "...", "output": "..."}
|
41 |
```
|
42 |
-
-
|
43 |
- completion: raw corpus
|
44 |
```json
|
45 |
{"text": "..."}
|
@@ -158,7 +177,7 @@ lora_target_modules:
|
|
158 |
lora_modules_to_save:
|
159 |
# - embed_tokens
|
160 |
# - lm_head
|
161 |
-
lora_out_dir:
|
162 |
lora_fan_in_fan_out: false
|
163 |
|
164 |
# wandb configuration if you're using it
|
|
|
33 |
|
34 |
### Dataset
|
35 |
|
36 |
+
Have a dataset in one of the following format (JSONL recommended):
|
37 |
|
38 |
+
- alpaca: instruction; input(optional)
|
39 |
```json
|
40 |
{"instruction": "...", "input": "...", "output": "..."}
|
41 |
```
|
42 |
+
- jeopardy: question and answer
|
43 |
+
```json
|
44 |
+
{"question": "...", "category": "...", "answer": "..."}
|
45 |
+
```
|
46 |
+
- oasst: instruction
|
47 |
+
```json
|
48 |
+
{"INSTRUCTION": "...", "RESPONSE": "..."}
|
49 |
+
```
|
50 |
+
- gpteacher: instruction; input(optional)
|
51 |
+
```json
|
52 |
+
{"instruction": "...", "input": "...", "response": "..."}
|
53 |
+
```
|
54 |
+
- reflection: instruction with reflect; input(optional)
|
55 |
+
```json
|
56 |
+
{"instruction": "...", "input": "...", "output": "...", "reflection": "...", "corrected": "..."}
|
57 |
+
```
|
58 |
+
- sharegpt: conversations
|
59 |
+
```json
|
60 |
+
{"conversations": [{"from": "...", "value": "..."}]}
|
61 |
+
```
|
62 |
- completion: raw corpus
|
63 |
```json
|
64 |
{"text": "..."}
|
|
|
177 |
lora_modules_to_save:
|
178 |
# - embed_tokens
|
179 |
# - lm_head
|
180 |
+
lora_out_dir:
|
181 |
lora_fan_in_fan_out: false
|
182 |
|
183 |
# wandb configuration if you're using it
|
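
The record shapes added in this change can be checked before training with a short script. The following is a minimal sketch, not part of axolotl: the names `REQUIRED_KEYS`, `missing_keys`, and `check_jsonl` are hypothetical, the key sets are copied from the README examples in the diff, and treating every key other than `input` as required is an assumption made for illustration.

```python
import json

# Keys per dataset type, copied from the README examples in this diff.
# The README marks "input" as optional for alpaca, gpteacher and reflection;
# treating all remaining keys as required is an assumption of this sketch.
REQUIRED_KEYS = {
    "alpaca": {"instruction", "output"},
    "jeopardy": {"question", "category", "answer"},
    "oasst": {"INSTRUCTION", "RESPONSE"},
    "gpteacher": {"instruction", "response"},
    "reflection": {"instruction", "output", "reflection", "corrected"},
    "sharegpt": {"conversations"},
    "completion": {"text"},
}


def missing_keys(record, dataset_type):
    """Return the sorted list of required keys absent from one record."""
    return sorted(REQUIRED_KEYS[dataset_type] - set(record))


def check_jsonl(lines, dataset_type):
    """Yield (line_number, missing_keys) for each incomplete JSONL line."""
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # blank lines carry no record
        missing = missing_keys(json.loads(line), dataset_type)
        if missing:
            yield number, missing
```

To check a file, pass an open file handle as `lines`, for example `list(check_jsonl(open("data.jsonl"), "alpaca"))`, where `data.jsonl` is a placeholder path. An empty result means every record carries the keys shown in the README for that type.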