Nanobit committed on
Commit
c22df8d
1 Parent(s): 68237ea

Add all dataset types

Files changed (1)
  1. README.md +23 -4
README.md CHANGED
@@ -33,13 +33,32 @@ Go ahead and axolotl questions!!
 
  ### Dataset
 
- Have a dataset in one of the following format:
+ Have a dataset in one of the following format (JSONL recommended):
 
- - alpaca: instruction
+ - alpaca: instruction; input(optional)
  ```json
  {"instruction": "...", "input": "...", "output": "..."}
  ```
- - #TODO add others
+ - jeopardy: question and answer
+ ```json
+ {"question": "...", "category": "...", "answer": "..."}
+ ```
+ - oasst: instruction
+ ```json
+ {"INSTRUCTION": "...", "RESPONSE": "..."}
+ ```
+ - gpteacher: instruction; input(optional)
+ ```json
+ {"instruction": "...", "input": "...", "response": "..."}
+ ```
+ - reflection: instruction with reflect; input(optional)
+ ```json
+ {"instruction": "...", "input": "...", "output": "...", "reflection": "...", "corrected": "..."}
+ ```
+ - sharegpt: conversations
+ ```json
+ {"conversations": [{"from": "...", "value": "..."}]}
+ ```
  - completion: raw corpus
  ```json
  {"text": "..."}
@@ -158,7 +177,7 @@ lora_target_modules:
  lora_modules_to_save:
  # - embed_tokens
  # - lm_head
- lora_out_dir: # TODO: explain
+ lora_out_dir:
  lora_fan_in_fan_out: false
 
  # wandb configuration if you're using it
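
As a rough illustration of the record shapes this commit documents, the sketch below checks a single JSONL line against the keys shown for each dataset type. `REQUIRED_KEYS` and `missing_keys` are hypothetical names and are not part of axolotl. Treating `input` as optional follows the "input(optional)" notes in the README; treating every other key shown (including `category` for jeopardy) as required is an assumption.

```python
import json

# Keys taken from the README examples in this diff.
# "input" is left out where the README marks it as optional.
# Assumption: every other key shown in an example is required.
REQUIRED_KEYS = {
    "alpaca": {"instruction", "output"},
    "jeopardy": {"question", "category", "answer"},
    "oasst": {"INSTRUCTION", "RESPONSE"},
    "gpteacher": {"instruction", "response"},
    "reflection": {"instruction", "output", "reflection", "corrected"},
    "sharegpt": {"conversations"},
    "completion": {"text"},
}


def missing_keys(dataset_type, line):
    """Return the sorted list of required keys absent from one JSONL line."""
    record = json.loads(line)
    return sorted(REQUIRED_KEYS[dataset_type] - set(record))
```

For example, `missing_keys("gpteacher", '{"instruction": "a", "output": "b"}')` returns `["response"]`, which flags an alpaca-style record being used where the gpteacher shape is expected.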