Nanobit committed on May 21, 2023

Commit c22df8d • 1 Parent(s): 68237ea

Add all dataset types

Files changed (1)

README.md

@@ -33,13 +33,32 @@ Go ahead and axolotl questions!!
 ### Dataset
-Have a dataset in one of the following format:
-- alpaca: instruction
   ```json
   {"instruction": "...", "input": "...", "output": "..."}
   ```
-- #TODO add others
 - completion: raw corpus
   ```json
   {"text": "..."}
@@ -158,7 +177,7 @@ lora_target_modules:
 lora_modules_to_save:
 #  - embed_tokens
 #  - lm_head
-lora_out_dir: # TODO: explain
 lora_fan_in_fan_out: false
 # wandb configuration if you're using it

 ### Dataset
+Have a dataset in one of the following formats (JSONL recommended):
+- alpaca: instruction; input(optional)
   ```json
   {"instruction": "...", "input": "...", "output": "..."}
   ```
+- jeopardy: question and answer
+  ```json
+  {"question": "...", "category": "...", "answer": "..."}
+  ```
+- oasst: instruction
+  ```json
+  {"INSTRUCTION": "...", "RESPONSE": "..."}
+  ```
+- gpteacher: instruction; input(optional)
+  ```json
+  {"instruction": "...", "input": "...", "response": "..."}
+  ```
+- reflection: instruction with reflect; input(optional)
+  ```json
+  {"instruction": "...", "input": "...", "output": "...", "reflection": "...", "corrected": "..."}
+  ```
+- sharegpt: conversations
+  ```json
+  {"conversations": [{"from": "...", "value": "..."}]}
+  ```
 - completion: raw corpus
   ```json
   {"text": "..."}
 lora_modules_to_save:
 #  - embed_tokens
 #  - lm_head
+lora_out_dir:
 lora_fan_in_fan_out: false
 # wandb configuration if you're using it
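
The dataset formats this commit adds to the README can be sanity-checked before training. The following is a minimal Python sketch, not part of axolotl: the names `REQUIRED_KEYS`, `missing_keys`, and `check_jsonl` are hypothetical, and the key sets are copied from the README examples in the diff. It assumes that only `input` is optional (the README marks it so for alpaca, gpteacher, and reflection) and that every other key shown in an example is required, which the README does not state explicitly.

```python
import json

# Keys per dataset type, copied from the README examples in this commit.
# Assumption: every key shown is required except "input", which the README
# marks as optional for alpaca, gpteacher and reflection.
REQUIRED_KEYS = {
    "alpaca": {"instruction", "output"},
    "jeopardy": {"question", "category", "answer"},
    "oasst": {"INSTRUCTION", "RESPONSE"},
    "gpteacher": {"instruction", "response"},
    "reflection": {"instruction", "output", "reflection", "corrected"},
    "sharegpt": {"conversations"},
    "completion": {"text"},
}


def missing_keys(record, dataset_type):
    """Return the sorted list of expected keys absent from one record."""
    return sorted(REQUIRED_KEYS[dataset_type] - set(record))


def check_jsonl(lines, dataset_type):
    """Check an iterable of JSONL lines; return (line_number, missing_keys) pairs."""
    problems = []
    for lineno, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        missing = missing_keys(json.loads(line), dataset_type)
        if missing:
            problems.append((lineno, missing))
    return problems


if __name__ == "__main__":
    sample = [
        '{"instruction": "Add 2 and 3", "input": "", "output": "5"}',
        '{"instruction": "No output given"}',
    ]
    for lineno, missing in check_jsonl(sample, "alpaca"):
        print(f"line {lineno}: missing {missing}")
```

`check_jsonl` accepts any iterable of strings, so an open file handle can be passed directly. It checks key presence only; it does not validate value types or the structure of the `conversations` entries.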