
Hi. Thanks for your great models. But you should upload added_tokens.json to all of them.

#7 opened by NikolayKozloff

The online HF converter GGUF-my-repo raises an error when I try to convert your models to GGUF:

Error: Error converting to fp16: Traceback (most recent call last):
  File "/home/user/app/llama.cpp/convert.py", line 1548, in <module>
    main()
  File "/home/user/app/llama.cpp/convert.py", line 1542, in main
    OutputFile.write_all(outfile, ftype, params, model, vocab, special_vocab,
  File "/home/user/app/llama.cpp/convert.py", line 1207, in write_all
    check_vocab_size(params, vocab, pad_vocab=pad_vocab)
  File "/home/user/app/llama.cpp/convert.py", line 1049, in check_vocab_size
    raise ValueError(msg)
ValueError: Vocab size mismatch (model has 57344, but SambaLingo-Hungarian-Base/tokenizer.model has 52603). Add the --pad-vocab option and try again.
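For anyone hitting this, here is a minimal sketch of the check that fails, assuming a local snapshot of the repo (the directory name is just the one from the traceback) and the sentencepiece package:

```python
# Minimal sketch of the check convert.py performs: it reads tokenizer.model
# (SentencePiece) plus added_tokens.json, and expects their combined size to
# match the model's embedding rows (config.vocab_size = 57344 here).
import json
from pathlib import Path

import sentencepiece as spm  # pip install sentencepiece

model_dir = Path("SambaLingo-Hungarian-Base")  # assumed local snapshot of the repo

sp = spm.SentencePieceProcessor(model_file=str(model_dir / "tokenizer.model"))
base_vocab = sp.vocab_size()  # 52603 for this model

added_path = model_dir / "added_tokens.json"
added = json.loads(added_path.read_text()) if added_path.exists() else {}

print("tokenizer.model pieces:     ", base_vocab)
print("added_tokens.json entries:  ", len(added))
print("total visible to convert.py:", base_vocab + len(added))
# Without added_tokens.json the total stays at 52603, the embedding matrix
# has 57344 rows, and check_vocab_size() raises the mismatch error above.
```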

The same error occurs locally if I run convert.py from the llama.cpp GitHub repo on my laptop. The --pad-vocab option without added_tokens.json spoils the model by inserting tokens in the wrong places.
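In case it is useful, here is a sketch of how the missing file could be reconstructed by diffing the fast tokenizer's full vocabulary (tokenizer.json) against tokenizer.model. This assumes a BPE-style tokenizer.json where ["model"]["vocab"] is a {token: id} mapping, the usual layout for Llama-family tokenizers; I have not verified it against your training setup:

```python
# Sketch: derive added_tokens.json from the difference between the fast
# tokenizer's full vocabulary (tokenizer.json) and the SentencePiece model.
# Assumes a BPE-style tokenizer.json where ["model"]["vocab"] is {token: id}.
import json
from pathlib import Path

import sentencepiece as spm

model_dir = Path("SambaLingo-Hungarian-Base")  # assumed local snapshot of the repo

sp = spm.SentencePieceProcessor(model_file=str(model_dir / "tokenizer.model"))
base = {sp.id_to_piece(i) for i in range(sp.vocab_size())}

tok = json.loads((model_dir / "tokenizer.json").read_text(encoding="utf-8"))
full_vocab = tok["model"]["vocab"]

# Anything in the fast tokenizer but not in tokenizer.model must be an
# added token; keep its original id so the embedding rows line up.
added = {t: i for t, i in full_vocab.items() if t not in base}
(model_dir / "added_tokens.json").write_text(
    json.dumps(added, ensure_ascii=False, indent=2), encoding="utf-8"
)
print(f"wrote {len(added)} entries to added_tokens.json")
```

With that file in place, a plain `python convert.py SambaLingo-Hungarian-Base` run (no --pad-vocab) should see all 57344 tokens, if my reading of check_vocab_size is right.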

SambaNova Systems org

Hi @NikolayKozloff, so sorry for the delayed response.

Could you share what happens when you use --pad-vocab and why it spoils the model? That seems strange to us.

Also, how were you able to create quantized GGUF versions of the other models without any issues?

Thank you for your feedback; we will try to fix this issue!
