Wrong prompt format in tokenizer_config.json?
The chat_template specified in tokenizer_config.json is ChatML, but apparently this model uses the (weird) GPT4 Correct prompt format. Please clarify which prompt format/chat template is correct, state it on the model card, and make sure tokenizer_config.json carries the matching template. Thank you!
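For anyone else comparing, here's a minimal sketch of the two formats in question (token names follow the public ChatML and OpenChat conventions, not anything taken from this repo's config):

```python
# ChatML turn structure (what tokenizer_config.json currently declares):
chatml_prompt = (
    "<|im_start|>user\n"
    "Hello!<|im_end|>\n"
    "<|im_start|>assistant\n"
)

# GPT4 Correct turn structure (OpenChat-style, used by the source models):
gpt4_correct_prompt = (
    "GPT4 Correct User: Hello!<|end_of_turn|>"
    "GPT4 Correct Assistant:"
)
```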
@mlabonne What's the actual chat template? In your tokenizer_config.json, chat_template is set to ChatML, but the models this merge is built from use the GPT4 Correct prompt format. How do you prompt it properly?
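In case it helps, this is how I've been checking what the stored template actually renders (a minimal sketch; the repo id is a placeholder, substitute the real model path):

```python
from transformers import AutoTokenizer

# Placeholder repo id for illustration only.
tokenizer = AutoTokenizer.from_pretrained("mlabonne/model-name")

messages = [{"role": "user", "content": "What format do you expect?"}]

# Renders the Jinja chat_template stored in tokenizer_config.json,
# so the output shows exactly which format the repo currently ships.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```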
I used TheBloke's GGUF because the HF version crashed with the error message "RuntimeError: CUDA error: device-side assert triggered". Is that a known issue or just a problem on my end?
Yeah, I managed to make it work with ChatML without any issues, but results seem to depend on your setup, since there's no pre-defined chat template. As you said, this is a merge of several models that use the GPT4 Correct prompt format, but its special tokens (like <|end_of_turn|>) are not implemented in this tokenizer. I tried a few configurations and I'm opting for a modified GPT4 Correct prompt format with a different eos token. I believe it's the best solution, but I haven't tested it thoroughly. The CUDA error is also fixed.
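Something along these lines, to illustrate what I mean (a sketch only: GPT4 Correct-style turns closed with the tokenizer's own eos token instead of <|end_of_turn|>; the template actually shipped in tokenizer_config.json is authoritative, and the repo id is a placeholder):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mlabonne/model-name")  # placeholder id

# GPT4 Correct-style turns, but each turn ends with the tokenizer's own
# eos_token rather than <|end_of_turn|>, since that token isn't in this
# merge's vocabulary.
tokenizer.chat_template = (
    "{{ bos_token }}"
    "{% for message in messages %}"
    "{{ 'GPT4 Correct ' + message['role'].title() + ': ' + message['content'] + eos_token }}"
    "{% endfor %}"
    "{% if add_generation_prompt %}{{ 'GPT4 Correct Assistant:' }}{% endif %}"
)
```

The point of the eos swap is that generation stops at a token the model can actually emit, instead of waiting for an <|end_of_turn|> that never appears.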