Add tokenizer.model from the base model
#3 opened by asedmammad
Add tokenizer.model from the base model (mistralai/Mistral-7B-v0.1)
Just a quick question, since we're probably going to merge this PR: does this tokenizer perform better than the current one?
This is the vanilla mistral-7b-v0.1 tokenizer.model file, which is missing from the current maral-7b repo, so the repo in its current state was not usable for me.
I was quantizing this model and encountered an "spm tokenizer model not found" error in the process; adding the file solves the issue.
I also plan to work on researching and training a better tokenizer for Persian text, and hopefully I can help with that in the near future.
Thanks for the description. Merged.
Muhammadreza changed pull request status to merged