marianbasti/XTTS-v2-argentinian-spanish

marianbasti committed on Jul 2, 2024

Commit 0caff00 · verified · 1 Parent(s): abcad42

Update README.md

Files changed (1)

README.md +68 -5

@@ -1,5 +1,68 @@
----
-license: other
-license_name: coqui-public-model-license
-license_link: LICENSE
----

+---
+license: other
+license_name: coqui-public-model-license
+license_link: https://coqui.ai/cpml
+library_name: coqui
+pipeline_tag: text-to-speech
+datasets:
+- ylacombe/google-argentinian-spanish
+language:
+- es
+---
+# ⓍTTS 🇦🇷
+ⓍTTS is a voice generation model that lets you clone voices into different languages using just a 6-second audio clip, with no need for hours of training data.
+This is the same as, or similar to, the model that powers [Coqui Studio](https://coqui.ai/) and [Coqui API](https://docs.coqui.ai/docs).
+### Language
+The Spanish language of this model has been fine-tuned on [ylacombe's Google Argentinian Spanish dataset](https://huggingface.co/datasets/ylacombe/google-argentinian-spanish) to achieve an Argentinian accent.
+### Training Parameters
+```
+batch_size=8,
+grad_accum_steps=96,
+batch_group_size=48,
+eval_batch_size=8,
+num_loader_workers=8,
+eval_split_max_size=256,
+optimizer="AdamW",
+optimizer_wd_only_on_weights=True,
+optimizer_params={"betas": [0.9, 0.96], "eps": 1e-8, "weight_decay": 1e-2},
+lr=5e-06,
+lr_scheduler="MultiStepLR",
+lr_scheduler_params={"milestones": [50000 * 18, 150000 * 18, 300000 * 18], "gamma": 0.5, "last_epoch": -1},
+```
+### License
+This model is licensed under [Coqui Public Model License](https://coqui.ai/cpml). There's a lot that goes into a license for generative models, and you can read more of [the origin story of CPML here](https://coqui.ai/blog/tts/cpml).
+### Usage
+Using the 🐸TTS command line:
+```console
+ tts --model_path /path/to/xtts/ \
+     --config_path /path/to/xtts/config.json \
+     --text "Che boludo, vamos a tomar unos mates." \
+     --speaker_wav /path/to/target/speaker.wav \
+     --language_idx es \
+     --use_cuda true
+```
+Using the model directly:
+```python
+from TTS.tts.configs.xtts_config import XttsConfig
+from TTS.tts.models.xtts import Xtts
+
+# Load the model configuration and the fine-tuned checkpoint.
+config = XttsConfig()
+config.load_json("/path/to/xtts/config.json")
+model = Xtts.init_from_config(config)
+model.load_checkpoint(config, checkpoint_dir="/path/to/xtts/", eval=True)
+model.cuda()  # requires a CUDA-capable GPU
+
+# Clone the voice in the reference clip and synthesize Spanish speech.
+outputs = model.synthesize(
+    "Che boludo, vamos a tomar unos mates.",
+    config,
+    speaker_wav="/path/to/target/speaker.wav",
+    gpt_cond_len=3,
+    language="es",
+)
+```
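
The Python snippet in the README ends at `model.synthesize(...)` and does not write the audio to disk. Below is a minimal sketch of one way to do that using only the standard library. It assumes that `outputs["wav"]` is a one-dimensional sequence of float samples in the range [-1, 1] and that the model emits audio at 24 kHz (the value to check is `config.audio.output_sample_rate`). Both are assumptions about the Coqui TTS library, not something this commit states. `save_wav` and `XTTS_SAMPLE_RATE` are hypothetical names introduced here for illustration.

```python
import sys
import wave
from array import array

# Assumed XTTS-v2 output rate; verify against config.audio.output_sample_rate.
XTTS_SAMPLE_RATE = 24000


def save_wav(samples, path, sample_rate=XTTS_SAMPLE_RATE):
    """Write mono float samples in [-1.0, 1.0] to a 16-bit PCM WAV file.

    Hypothetical helper, not part of the TTS library.
    Returns the number of frames written.
    """
    # Clip each sample to [-1, 1], then scale to a signed 16-bit integer.
    pcm = array("h", (int(max(-1.0, min(1.0, float(s))) * 32767) for s in samples))
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV sample data is little-endian
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes = 16-bit samples
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm.tobytes())
    return len(pcm)
```

With the README snippet already run, a call such as `save_wav(outputs["wav"], "output.wav")` would then produce a playable file, provided the assumptions above hold for the installed TTS version.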