---
license: cc-by-nc-4.0
language:
- en
---
Models trained with [VITS-fast-fine-tuning](https://github.com/Plachtaa/VITS-fast-fine-tuning)
- Three speakers: laoliang (老梁), specialweek, zhongli.
- The model is based on the C+J base model and was trained for 300 epochs on a single NVIDIA 3090, taking about 4.5 hours in total.
- The training data consists of a single long audio recording of laoliang (~5 minutes) plus auxiliary data.

How to run the model?
- Follow [the official instructions](https://github.com/Plachtaa/VITS-fast-fine-tuning/blob/main/LOCAL.md) to install the required libraries.
- Download the models and move _finetune_speaker.json_ and _G_latest.pth_ to _/path/to/VITS-fast-fine-tuning_.
- Run _python VC_inference.py --model_dir ./G_latest.pth --share True_ to start a local Gradio inference demo.
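The steps above can be sketched as a shell session. This is a minimal sketch, assuming you clone the repository fresh and that the two model files have already been downloaded to a local directory (the `/path/to/downloads` location is a placeholder, not part of this model card):

```shell
# Clone the inference repository and install its dependencies.
git clone https://github.com/Plachtaa/VITS-fast-fine-tuning
cd VITS-fast-fine-tuning
pip install -r requirements.txt

# Move the downloaded model files into the repository root.
# (/path/to/downloads is a placeholder for wherever you saved them.)
mv /path/to/downloads/finetune_speaker.json .
mv /path/to/downloads/G_latest.pth .

# Start the local Gradio inference demo.
python VC_inference.py --model_dir ./G_latest.pth --share True
```

With `--share True`, Gradio additionally exposes a temporary public link alongside the local demo; drop the flag if you only need local access.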

File structure
```bash
VITS-fast-fine-tuning
β”œβ”€β”€β”€VC_inference.py
β”œβ”€β”€β”€...
β”œβ”€β”€β”€finetune_speaker.json
└───G_latest.pth
```