license: apache-2.0

KhanomTan TTS v1.0

KhanomTan TTS (ขนมตาล) is an open-source Thai text-to-speech model with multilingual speaker support. It supports Thai, English, and other languages.

KhanomTan TTS is a YourTTS model trained with Thai support. We added the Thai speech corpora TSync 1* and TSync 2* to the mbarnig/lb-de-fr-en-pt-12800-TTS-CORPUS dataset and trained a YourTTS model on the combined data.

Config

We added Thai characters to the graphemes config used for training the model, and we use the Speaker Encoder model from 🐸 Coqui-TTS.
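As an illustration of extending a grapheme set with Thai characters, here is a minimal sketch in plain Python. It is not the configuration used for this model: the base grapheme string, the helper names `thai_characters` and `extend_graphemes`, and the `characters_config` dictionary are all assumptions made for the example, and the exact Thai character list used in training may differ.

```python
import unicodedata


def thai_characters() -> str:
    """Return every assigned code point in the Unicode Thai block (U+0E00 to U+0E7F)."""
    return "".join(
        chr(cp)
        for cp in range(0x0E00, 0x0E80)
        if unicodedata.name(chr(cp), "")  # skip unassigned code points
    )


def extend_graphemes(base: str, extra: str) -> str:
    """Append characters from `extra` to `base`, keeping order and dropping duplicates."""
    seen = set(base)
    result = list(base)
    for ch in extra:
        if ch not in seen:
            seen.add(ch)
            result.append(ch)
    return "".join(result)


# Hypothetical base grapheme set (Latin letters only), for illustration.
base_graphemes = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"

# Hypothetical config fragment; the key name is an assumption for this sketch.
characters_config = {
    "characters": extend_graphemes(base_graphemes, thai_characters()),
}
```

The original characters keep their positions and the Thai characters are appended at the end, so the existing part of the grapheme set is left unchanged.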

Dataset

We added the Tsync 1 and Tsync 2 corpora, which are not complete datasets, to the mbarnig/lb-de-fr-en-pt-12800-TTS-CORPUS dataset.
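The combined data can be pictured as a list of per-language dataset entries. The sketch below is only an illustration under stated assumptions: the entry names, the local paths, and the per-language split of the mbarnig corpus are hypothetical, and the language codes are read off the corpus name (lb, de, fr, en, pt) plus "th" for Thai.

```python
from collections import defaultdict

# Hypothetical entries and paths; the real corpus layout is not described here.
datasets = [
    {"name": "tsync1", "language": "th", "path": "data/tsync1"},
    {"name": "tsync2", "language": "th", "path": "data/tsync2"},
]

# Language codes taken from the corpus name "lb-de-fr-en-pt".
for lang in ["lb", "de", "fr", "en", "pt"]:
    datasets.append(
        {"name": f"mbarnig-{lang}", "language": lang, "path": f"data/mbarnig/{lang}"}
    )

# Group dataset names by language to check what the combined set covers.
by_language = defaultdict(list)
for entry in datasets:
    by_language[entry["language"]].append(entry["name"])
```

Grouping by language makes it easy to confirm that Thai is covered alongside the five languages of the existing corpus.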

Training the model

We trained the model with the 🐸 Coqui-TTS multilingual VITS-model recipe (version 0.7.1, commit id d46fbc240ccf21797d42ac26cb27eb0b9f8d31c4) and the speaker encoder model from 🐸 Coqui-TTS. We then released the best model for public access.

Model cards: https://github.com/wannaphong/KhanomTan-TTS-v1.0
Dataset (Tsync 1 and Tsync 2 only): https://huggingface.co/datasets/wannaphong/tsync1-2-yourtts
GitHub: https://github.com/wannaphong/KhanomTan-TTS-v1.0

*Note: These are not the complete corpora. We could access only the publicly available portions.