	---
	language: sw
	license: cc-by-sa-4.0
	tags:
	- audio
	- text-to-speech
	inference: false
	datasets:
	- bookbot/OpenBible_Swahili
	---

	# VITS Base sw-KE-OpenBible

	VITS Base sw-KE-OpenBible is an end-to-end text-to-speech model based on the [VITS](https://arxiv.org/abs/2106.06103) architecture. This model was trained from scratch on a real audio dataset. The list of real speakers includes:

	- sw-KE-OpenBible

	The model's [vocabulary](https://huggingface.co/bookbot/vits-base-sw-KE-OpenBible/blob/main/symbols.py) contains the different IPA phonemes found in [gruut](https://github.com/rhasspy/gruut).
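
As a rough illustration of how a phoneme vocabulary like this is typically consumed, the sketch below maps phoneme symbols to integer IDs by their position in a symbol list. The `symbols` list here is a tiny hypothetical subset chosen for the example, not the contents of `symbols.py`, and skipping unknown symbols is a choice made for this sketch rather than confirmed behaviour of the training code.

```python
# Hypothetical, tiny symbol inventory for illustration only;
# the real list lives in the repository's symbols.py.
symbols = ["_", " ", "a", "i", "k", "m", "u", "ɑ", "ɛ", "ɔ"]

# Each symbol's ID is simply its index in the list.
symbol_to_id = {s: i for i, s in enumerate(symbols)}


def phonemes_to_ids(phonemes):
    """Map phoneme symbols to integer IDs, skipping symbols not in the vocabulary."""
    return [symbol_to_id[p] for p in phonemes if p in symbol_to_id]


ids = phonemes_to_ids(["m", "ɑ", "k", "i"])
print(ids)
```

Passing a list of phonemes instead of a plain string keeps multi-character IPA symbols intact, which matters when a phonemizer emits phonemes longer than one code point.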

	This model was trained using the [VITS](https://github.com/jaywalnut310/vits) framework. All training was done on a Scaleway L40S VM with an NVIDIA L40S GPU. All scripts used for training can be found in the [Files and versions](https://huggingface.co/bookbot/vits-base-sw-KE-OpenBible/tree/main) tab, along with the [Training metrics](https://huggingface.co/bookbot/vits-base-sw-KE-OpenBible/tensorboard) logged via TensorBoard.

	## Model

	| Model                     | SR (Hz) | Mel range (Hz) | FFT / Hop / Win   | #epochs |
	| ------------------------- | ------- | -------------- | ----------------- | ------- |
	| VITS Base sw-KE-OpenBible | 44.1K   | 0-null         | 2048 / 512 / 2048 | 12000   |
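
To make the numbers in the table more concrete, the following sketch derives the frame timing they imply. It uses only the sample rate, FFT size, hop length and window length listed above; the function name `frame_stats` is made up for this example. The Nyquist value is included on the assumption that a `null` upper mel bound means "up to half the sample rate", which is what librosa's mel filterbank does when `fmax` is `None`; whether the training code uses it that way is not stated here.

```python
# Values copied from the table above.
SAMPLE_RATE_HZ = 44_100
N_FFT = 2048
HOP_LENGTH = 512
WIN_LENGTH = 2048


def frame_stats(sample_rate, n_fft, hop_length, win_length):
    """Derive frame timing and spectrogram size from STFT settings."""
    return {
        "hop_ms": 1000 * hop_length / sample_rate,       # time between frames
        "window_ms": 1000 * win_length / sample_rate,    # duration of one analysis window
        "frames_per_second": sample_rate / hop_length,   # spectrogram frame rate
        "linear_bins": n_fft // 2 + 1,                   # bins in a one-sided spectrum
        "nyquist_hz": sample_rate / 2,                   # assumed upper bound when fmax is null
    }


stats = frame_stats(SAMPLE_RATE_HZ, N_FFT, HOP_LENGTH, WIN_LENGTH)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

With these settings each frame advances by about 11.6 ms, each window spans about 46.4 ms, and the linear spectrogram has 1025 frequency bins.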

	## Training procedure

	### Prepare Data

	```sh
	python preprocess.py \
	--text_index 1 \
	--filelists filelists/sw-KE-OpenBible_text_train_filelist.txt filelists/sw-KE-OpenBible_text_val_filelist.txt \
	--text_cleaners swahili_cleaners
	```
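
The command above can be pictured with a minimal sketch, assuming the filelists use the pipe-separated `path|text` layout of the upstream VITS repository, where `--text_index 1` selects the text column. The cleaner below is a hypothetical placeholder that only lowercases and collapses whitespace; it is not the real `swahili_cleaners`, and the sample line is made up.

```python
import re


def placeholder_cleaner(text: str) -> str:
    """Hypothetical stand-in for a text cleaner: lowercase and collapse whitespace."""
    return re.sub(r"\s+", " ", text.lower()).strip()


def clean_filelist_lines(lines, text_index=1, cleaner=placeholder_cleaner):
    """Apply `cleaner` to the field at `text_index` of each pipe-separated line."""
    cleaned = []
    for line in lines:
        fields = line.rstrip("\n").split("|")
        fields[text_index] = cleaner(fields[text_index])
        cleaned.append("|".join(fields))
    return cleaned


# Made-up filelist entry: audio path, then transcript.
example = ["wavs/sample_0001.wav|Habari  ya ASUBUHI\n"]
print(clean_filelist_lines(example))
```

Only the field at `text_index` is rewritten, so the audio path and any other columns pass through unchanged.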

	### Train

	```sh
	python train.py -c configs/sw_ke_openbible_base.json -m sw_ke_openbible_base
	```

	## Frameworks

	- PyTorch 2.2.2
	- [VITS](https://github.com/bookbot-hive/vits)