Vocab size mismatch?
I XOR-decoded the weights and verified the MD5 checksums, and then when I was running a ggml conversion script I stumbled over this:
Vocab size mismatch (model has 32016, but /AI/text-generation-webui/models/oasst-sft-7-llama-30b/tokenizer.model combined with /AI/text-generation-webui/models/oasst-sft-7-llama-30b/added_tokens.json has 32005).
I'm wondering whether this is a problem in the conversion script, or whether there really is a mismatch in the vocab size?
You can fix it with the solution at https://huggingface.co/OpenAssistant/oasst-sft-6-llama-30b-xor/discussions/2
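For reference, here is a minimal sketch of that kind of workaround in Python. It assumes what the error message implies: the model's embedding matrix was padded to 32016 entries, while tokenizer.model contributes LLaMA's 32000 base tokens plus 5 entries from added_tokens.json (32005 total). The sketch pads added_tokens.json with dummy entries until the counts match; the dummy token names and the path are placeholders, not part of the official fix, so check the linked discussion for the exact token names it uses.

```python
import json

# Placeholder path, adjust to your setup.
tokenizer_dir = "/AI/text-generation-webui/models/oasst-sft-7-llama-30b"
model_vocab_size = 32016   # size reported by the conversion script
base_vocab_size = 32000    # tokens contributed by tokenizer.model (LLaMA base)

with open(f"{tokenizer_dir}/added_tokens.json") as f:
    added_tokens = json.load(f)  # maps token string -> token id

current = base_vocab_size + len(added_tokens)

# Append dummy tokens until the combined vocab matches the model.
# The "<dummy...>" names are hypothetical placeholders.
for i in range(current, model_vocab_size):
    added_tokens[f"<dummy{i}>"] = i

with open(f"{tokenizer_dir}/added_tokens.json", "w") as f:
    json.dump(added_tokens, f, indent=2)
```

After rewriting added_tokens.json this way, the combined tokenizer vocab should equal 32016 and the conversion script's check should pass.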