What is the prompt format?
Is it ChatML, or something else?
That's a great question. From the original model page I would have guessed ChatML (it uses the im_end token), but from their hosted demo it looks like Llama 2 or Mistral format (it uses [INST]).
From the original repo's tokenizer_config.json:
...
"chat_template": "{% for message in messages %}{{'<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %}",
...
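For anyone who wants to double-check, here is a minimal sketch of rendering that template with transformers' apply_chat_template; the repo id below is my assumption for the original (non-GGUF) model:

# Minimal sketch: render the ChatML chat_template above via transformers.
# The repo id is an assumption (the original model this GGUF comes from).
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceH4/starchat2-15b-v0.1")

messages = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write hello world in Rust."},
]

# add_generation_prompt=True appends "<|im_start|>assistant\n", per the template.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
# <|im_start|>system
# You are a helpful coding assistant.<|im_end|>
# <|im_start|>user
# Write hello world in Rust.<|im_end|>
# <|im_start|>assistant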
Have you guys gotten the model to run?
Sure. I use it with ollama. See here https://ollama.com/sskostyaev/starchat2-15b/tags
Yes and yes.
To submit your own models to ollama, you need to register on the ollama hub (ollama.ai), configure ollama to use your key for pushes, and create your model with your username as a prefix; then you can push the model to the ollama hub.
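If you want to call the ollama-hosted model from Python, a rough sketch against ollama's local HTTP chat endpoint could look like the following; the model name comes from the link above, while the default port 11434 and the exact response fields are assumptions about the ollama API:

# Rough sketch: query the model served by ollama over its local HTTP API.
# Assumes `ollama pull sskostyaev/starchat2-15b` has been run (pick a tag from the tags page).
import json
import urllib.request

payload = {
    "model": "sskostyaev/starchat2-15b",
    "messages": [{"role": "user", "content": "Write a quicksort in Python."}],
    "stream": False,
}
req = urllib.request.Request(
    "http://localhost:11434/api/chat",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["message"]["content"])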
Hi,
I currently use llama-cpp-python for CodeLlama and Mistral, and this is my demo code for the prompt format. I want to know how to use the StarCoder model in the same setup. What is the prompt format and what is the stop token?
Reference snippet:
input_prompt = f"[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n "
else:
input_prompt = f"[INST] "
for interaction in history:
input_prompt = input_prompt + str(interaction[0]) + " [/INST] " + str(interaction[1]) + " </s><s> [INST] "
input_prompt = input_prompt + str(message) + " [/INST] "
Output snippet below; these are the stop tokens I currently use:
output = llm(
    input_prompt,
    temperature=Env.TEMPERATURE,
    top_p=Env.TOP_P,
    top_k=Env.TOP_K,
    repeat_penalty=Env.REPEAT_PENALTY,
    max_tokens=max_tokens_input,
    stop=[
        "<|prompter|>",
        "<|endoftext|>",
        "<|endoftext|> \n",
        "ASSISTANT:",
        "USER:",
        "SYSTEM:",
    ],
    stream=True,
)
@madhucharan for me this works fine:
{
    "stop": [
        "<|im_start|>",
        "<|im_end|>"
    ]
}
Hi @NeoDim,
Thanks for the response. But I want to know the input prompt format as well. This is my new format snippet; please let me know if it's correct.
if use_system_prompt:
    input_prompt = f"<|im_start|> system\n{system_prompt} <|im_end|>\n"
else:
    input_prompt = f"<|im_start|>"

input_prompt = f"{input_prompt}user\n{str(message)}<|im_end|>\n<|im_start|>assistant\n"
output = llm(
    input_prompt,
    temperature=Env.TEMPERATURE,
    top_p=Env.TOP_P,
    top_k=Env.TOP_K,
    repeat_penalty=Env.REPEAT_PENALTY,
    max_tokens=max_tokens_input,
    stop=[
        "<|im_start|>",
        "<|im_end|>"
    ],
    stream=True,
)
@madhucharan
See this comment https://huggingface.co/bartowski/starchat2-15b-v0.1-GGUF/discussions/1#65fb102fbb78d93852b6a3ba
I don't use [INST] tags or <s> tags, and I don't insert a space between <|im_start|> and the role (like system) as you do in your message. See the template here: https://ollama.com/sskostyaev/starchat2-15b
@NeoDim I followed your ollama template and added \n wherever your template had a line break (or a line ended). In the template there is a new line after system, so now I'm confused about whether I have to remove the \n after system. There is no \n between <> and system, right? It comes after system?
I removed the redundant lines where [INST] was used; I forgot to remove them in the comment above before posting.
if use_system_prompt:
    input_prompt = f"<|im_start|>system\n{system_prompt}<|im_end|>\n"
else:
    input_prompt = f"<|im_start|>"

input_prompt = f"{input_prompt}user\n{str(message)}<|im_end|>\n<|im_start|>assistant\n"
@madhucharan you don't need to remove \n after system.
There is no \n between <> and system, right? It comes after system?
Right.
Now your template looks right to me.
Thanks a lot for your time and support. I was a bit confused and now it's cleared up. I will test this and let you know.
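For anyone landing here later, below is a consolidated sketch of the ChatML prompt construction worked out in this thread, extended with chat history; llm, Env, history, message, system_prompt and max_tokens_input are the names from the earlier snippets and are assumed to exist. It also re-opens <|im_start|> before the user turn when a system prompt is present, matching the chat_template quoted at the top.

# Consolidated sketch of the ChatML format discussed above (names reused from earlier snippets).
def build_prompt(message, history, system_prompt=None):
    prompt = ""
    if system_prompt:
        prompt += f"<|im_start|>system\n{system_prompt}<|im_end|>\n"
    # Replay previous (user, assistant) turns, each as its own ChatML block.
    for user_turn, assistant_turn in history:
        prompt += f"<|im_start|>user\n{user_turn}<|im_end|>\n"
        prompt += f"<|im_start|>assistant\n{assistant_turn}<|im_end|>\n"
    # Current user message, then leave the assistant turn open for generation.
    prompt += f"<|im_start|>user\n{message}<|im_end|>\n<|im_start|>assistant\n"
    return prompt

output = llm(
    build_prompt(message, history, system_prompt),
    temperature=Env.TEMPERATURE,
    top_p=Env.TOP_P,
    top_k=Env.TOP_K,
    repeat_penalty=Env.REPEAT_PENALTY,
    max_tokens=max_tokens_input,
    stop=["<|im_start|>", "<|im_end|>"],
    stream=True,
)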