Fine-tuning/Tokenizer
Hello, I have been trying to fine-tune your model. At the time there was no discussion section, so I trained my own tokenizer on the text from that dataset. After fine-tuning your model with my tokenizer, I ended up with a German-sounding girl's voice, lol. I've been wondering: did you transliterate your text to Latin or not? Because when I try to tokenize with the default tokenizer, I get something like this.
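For anyone else hitting this: a tokenizer whose vocabulary was built only from Latin text will map every Cyrillic character to its unknown token, which is likely why the default tokenizer produces garbage on untranslated Russian input. A minimal, purely illustrative character-level sketch (hypothetical, not the model's actual tokenizer):

```python
# Hypothetical character-level tokenizer to illustrate the problem:
# a vocabulary built only from Latin text maps every Cyrillic
# character to the <unk> id.
def build_vocab(corpus):
    """Build a char-to-id vocabulary from a list of strings; id 0 is <unk>."""
    vocab = {"<unk>": 0}
    for c in sorted(set("".join(corpus))):
        vocab[c] = len(vocab)
    return vocab

def encode(text, vocab):
    """Map each character to its id, falling back to <unk> for unseen chars."""
    return [vocab.get(c, vocab["<unk>"]) for c in text]

latin_vocab = build_vocab(["hello world"])
print(encode("hello", latin_vocab))   # → [4, 3, 5, 5, 6]   (known characters)
print(encode("привет", latin_vocab))  # → [0, 0, 0, 0, 0, 0] (all <unk>)
```

Training a tokenizer directly on the dataset's own text, as described above, avoids this, since the Cyrillic characters then appear in the vocabulary.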
So, as I understand it, you converted the text to Latin. Could you please share the tool you used for the conversion?
Hi, Mukhamejan! No, I didn't convert the text to Latin.
You have done a lot of work creating this model, and it has great potential. Unfortunately, it places the stress incorrectly in many Russian words. How can this be fixed?