Tags: Text Generation · Transformers · PyTorch · mpt · Composer · MosaicML · llm-foundry · custom_code · text-generation-inference

add formatting example

#26
by sam-mosaic - opened
Files changed (1)
  1. README.md +25 -0
README.md CHANGED
@@ -98,6 +98,31 @@ from transformers import AutoTokenizer
  tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
  ```

+ ### Formatting
+
+ This model was trained on data formatted in the dolly-15k format:
+
+ ```python
+ INSTRUCTION_KEY = "### Instruction:"
+ RESPONSE_KEY = "### Response:"
+ INTRO_BLURB = "Below is an instruction that describes a task. Write a response that appropriately completes the request."
+ PROMPT_FOR_GENERATION_FORMAT = """{intro}
+ {instruction_key}
+ {instruction}
+ {response_key}
+ """.format(
+     intro=INTRO_BLURB,
+     instruction_key=INSTRUCTION_KEY,
+     instruction="{instruction}",
+     response_key=RESPONSE_KEY,
+ )
+
+ example = "James decides to run 3 sprints 3 times a week. He runs 60 meters each sprint. How many total meters does he run a week? Explain before answering."
+ fmt_ex = PROMPT_FOR_GENERATION_FORMAT.format(instruction=example)
+ ```
+
+ In the above example, `fmt_ex` is ready to be tokenized and sent through the model.
+
  ## Model Description

  The architecture is a modification of a standard decoder-only transformer.
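As a usage note on the added section (a minimal sketch, not part of the diff): because the decoded model output echoes the prompt, including `RESPONSE_KEY`, before the answer, the response portion can be recovered with plain string splitting. The `extract_response` helper and the sample `output` string below are illustrative assumptions, not part of the model card.

```python
RESPONSE_KEY = "### Response:"  # same delimiter as in the diff above


def extract_response(generated_text: str) -> str:
    """Return the text after the response delimiter.

    `generated_text` is assumed to be the full decoded model output,
    which echoes the prompt (including RESPONSE_KEY) before the answer.
    """
    _, _, response = generated_text.partition(RESPONSE_KEY)
    return response.strip()


# Hypothetical decoded output, for illustration only:
output = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n"
    "### Instruction:\nWhat is 2 + 2?\n"
    "### Response:\n2 + 2 = 4"
)
print(extract_response(output))
```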