Implementation of phonemizer

#3
by HolyWalley - opened

Hey
I want to use the model in my project. I'm not sure yet how heavy the load will be, but either way I'd like to run the phonemizer on my side, so that I don't overload your server and don't depend on it. Could you share what you use for it?

P.S.
I'm quite far from ML, so my question might be stupid, but...
I've tried it a couple of times (locally and at donar.by), and it still sounds quite robotic compared to what OpenAI TTS can do in English (in Belarusian they have a heavy accent). Of course, I don't expect an open-source model to perform the same way as OpenAI's, but I'd love to hear your opinion: what does this model lack to sound like theirs? Is it possible to fine-tune it on good-quality recordings to bring it closer to their quality? Would that make sense? Or is the quality of the original dataset or the model architecture the bigger problem?

Hey
I think I can share my phonemizing program. I was not the one who developed it, but I think that's not a problem.
I think the best way to share it is to just send you an archive with the code (since the core phonemizer module is just a jar file).
I can upload it to some cloud storage and give you a link, or you can give me a messenger contact and I will send it to you there.

Hey, thanks a lot! My Telegram is https://t.me/holywalley
