---
language:
- ru
tags:
- vits
license: cc-by-nc-4.0
pipeline_tag: text-to-speech
widget:
- example_title: text to speech
  text: >
    прив+ет, как дел+а? всё +очень хорош+о! а у тебя как?
---

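# VITS Text-to-Speech Model for Russian

The model accepts lowercase text only, so lowercase the input before tokenizing it. In the example text, a `+` precedes the stressed vowel of a word.
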
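The lowercasing step can be isolated in a small helper. The sketch below is illustrative only: `prepare_text` is a hypothetical name, and collapsing repeated whitespace is an added assumption rather than a documented requirement of the model.

```python
def prepare_text(text: str) -> str:
    """Lowercase the input and collapse runs of whitespace.

    "+" characters are left untouched. The whitespace handling is an
    assumption for tidiness, not a requirement stated by the model card.
    """
    return " ".join(text.lower().split())


print(prepare_text("Привет,  как дел+а?"))
```

The result can be passed to the tokenizer in place of the raw string.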

Text-to-speech example:

```python
import scipy.io.wavfile  # "import scipy" alone may not load the wavfile submodule
import torch
from transformers import AutoTokenizer, VitsModel

model = VitsModel.from_pretrained("joefox/tts_vits_ru_hf")
tokenizer = AutoTokenizer.from_pretrained("joefox/tts_vits_ru_hf")

# The model expects lowercase input.
text = "Привет, как дел+а? Всё +очень хорош+о! А у тебя как?"
text = text.lower()

inputs = tokenizer(text, return_tensors="pt")
inputs["speaker_id"] = 3  # index of the speaker (voice) to synthesize with

# Inference only, so gradient tracking is disabled.
with torch.no_grad():
    output = model(**inputs).waveform  # shape: (batch_size, num_samples)

# Save the first (and only) waveform in the batch as a WAV file.
scipy.io.wavfile.write("techno.wav", rate=model.config.sampling_rate, data=output[0].cpu().numpy())
```
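Because `output` holds floating-point samples, `scipy.io.wavfile.write` stores them as floating-point WAV data, which some audio players may not handle. As a hedged alternative, the sketch below writes 16-bit PCM using only the Python standard library. `save_pcm16_wav` and the file name in the usage comment are hypothetical, and clipping to [-1.0, 1.0] is an assumption about the waveform range.

```python
import array
import sys
import wave


def save_pcm16_wav(path, samples, sample_rate):
    """Write mono float samples to a 16-bit PCM WAV file (illustrative helper)."""
    # Assumption: samples are floats nominally in [-1.0, 1.0]; anything outside is clipped.
    pcm = array.array(
        "h", (int(max(-1.0, min(1.0, float(s))) * 32767) for s in samples)
    )
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV files store samples in little-endian order
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)  # mono
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16 bit
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm.tobytes())


# Possible usage with the variables from the example above:
# save_pcm16_wav("techno_pcm16.wav", output[0].tolist(), model.config.sampling_rate)
```

Converting the tensor with `.tolist()` keeps the helper free of NumPy and PyTorch dependencies, at the cost of some speed for long clips.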

To play the audio in a Jupyter Notebook or Google Colab:

```python
from IPython.display import Audio

# Pass the first waveform in the batch as a 1-D NumPy array.
Audio(output[0].cpu().numpy(), rate=model.config.sampling_rate)
```

## Languages covered

Russian (ru_RU)