whaleloops committed
Commit 4b1130d
Parent: 074ef30

Update README.md

Files changed (1): README.md (+51 -2)
README.md CHANGED
@@ -2,6 +2,55 @@
 license: apache-2.0
 ---
 
-Replicate of https://huggingface.co/Open-Orca/Mistral-7B-OpenOrca
+This is a replicate of https://huggingface.co/Open-Orca/Mistral-7B-OpenOrca
 
-But in safetensor format
+But in safetensors format.
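+
+Because the weights are stored as safetensors, they load through the standard `from_pretrained()` API. A minimal sketch (the repo id below is the upstream model and is an assumption; substitute this repository's id):
+
+```python
+from transformers import AutoModelForCausalLM
+
+# use_safetensors=True insists on the .safetensors weights rather than
+# any pickle-based .bin checkpoint
+model = AutoModelForCausalLM.from_pretrained(
+    "Open-Orca/Mistral-7B-OpenOrca",  # assumed repo id
+    use_safetensors=True,
+)
+```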
+
+
+# Prompt Template
+
+To use the prompt for further training and inference, please use [OpenAI's Chat Markup Language (ChatML)](https://github.com/openai/openai-python/blob/main/chatml.md) format, with `<|im_start|>` and `<|im_end|>` tokens added to support this.
+
+This means that, e.g., in [oobabooga](https://github.com/oobabooga/text-generation-webui/) the "`MPT-Chat`" instruction template should work, as it also uses ChatML.
+
+This formatting is also available via a pre-defined [Transformers chat template](https://huggingface.co/docs/transformers/main/chat_templating),
+which means that lists of messages can be formatted for you with the `apply_chat_template()` method:
+
+```python
+from transformers import AutoTokenizer
+
+# assumption: the upstream repo's tokenizer; substitute this repo's id if preferred
+tokenizer = AutoTokenizer.from_pretrained("Open-Orca/Mistral-7B-OpenOrca")
+
+chat = [
+    {"role": "system", "content": "You are MistralOrca, a large language model trained by Alignment Lab AI. Write out your reasoning step-by-step to be sure you get the right answers!"},
+    {"role": "user", "content": "How are you?"},
+    {"role": "assistant", "content": "I am doing well!"},
+    {"role": "user", "content": "Please tell me about how mistral winds have attracted super-orcas."},
+]
+tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
+```
+
+which will yield:
+
+```
+<|im_start|>system
+You are MistralOrca, a large language model trained by Alignment Lab AI. Write out your reasoning step-by-step to be sure you get the right answers!
+<|im_end|>
+<|im_start|>user
+How are you?<|im_end|>
+<|im_start|>assistant
+I am doing well!<|im_end|>
+<|im_start|>user
+Please tell me about how mistral winds have attracted super-orcas.<|im_end|>
+<|im_start|>assistant
+```
+
+If you use `tokenize=True` and `return_tensors="pt"` instead, then you will get a tokenized
+and formatted conversation ready to pass to `model.generate()`.
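+
+For example, a minimal sketch along those lines (assuming `model` and `tokenizer` have been loaded as in the snippets above):
+
+```python
+import torch
+
+# tokenize the ChatML-formatted conversation in one step
+input_ids = tokenizer.apply_chat_template(
+    chat, tokenize=True, add_generation_prompt=True, return_tensors="pt"
+).to(model.device)
+
+with torch.no_grad():
+    output_ids = model.generate(input_ids, max_new_tokens=256)
+
+# decode only the newly generated tokens
+print(tokenizer.decode(output_ids[0, input_ids.shape[-1]:], skip_special_tokens=True))
+```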
+
+
+# Inference
+
+See [this notebook](https://colab.research.google.com/drive/1yZlLSifCGELAX5GN582kZypHCv0uJuNX?usp=sharing) for inference details.
+
+Note that you currently need a development snapshot of Transformers, as support for Mistral has not been released on PyPI yet:
+
+```
+pip install git+https://github.com/huggingface/transformers
+```
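+
+A quick way to check that the installed snapshot actually includes Mistral support (a minimal sketch; `model_type` is the field Transformers uses to dispatch architectures):
+
+```python
+from transformers import AutoConfig
+
+# resolves to MistralConfig only on versions that ship Mistral support
+config = AutoConfig.from_pretrained("Open-Orca/Mistral-7B-OpenOrca")  # assumed repo id
+print(config.model_type)  # expected: "mistral"
+```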