Pretrained Model of Amphion NaturalSpeech2

We provide a pretrained checkpoint of NaturalSpeech2 trained on LibriTTS, a multi-speaker English corpus of approximately 585 hours of read speech sampled at 24 kHz.

Note that the current model is trained only on LibriTTS, so its training data is far smaller than the 55,000 hours used in the original paper. We will release models trained on large-scale data soon. Please stay tuned.

Quick Start

To use the pretrained model, run the following commands:

Step1: Download the checkpoint

git lfs install
git clone https://huggingface.co/amphion/naturalspeech2_libritts
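A quick sanity check after cloning can catch the case where Git LFS did not fetch the real weights. The sketch below assumes the checkpoint files are stored via Git LFS, as the `git lfs install` step suggests. The helper names `is_lfs_pointer` and `find_lfs_pointers` are hypothetical and not part of Amphion or Git.

```shell
# Hypothetical helpers (not part of Amphion or Git).
# A Git LFS pointer is a small text file whose first line is
# "version https://git-lfs.github.com/spec/v1".
is_lfs_pointer() {
    # Read only the first bytes so large binary checkpoints stay cheap to test.
    head -c 200 "$1" 2>/dev/null |
        grep -q '^version https://git-lfs.github.com/spec/v1'
}

# Print every file under a directory that is still an LFS pointer,
# skipping Git's own metadata.
find_lfs_pointers() {
    find "$1" -type f ! -path '*/.git/*' | while IFS= read -r lfs_file; do
        if is_lfs_pointer "$lfs_file"; then
            echo "$lfs_file"
        fi
    done
}

# Usage (run from the directory that contains the clone):
#   find_lfs_pointers naturalspeech2_libritts
```

If any paths are printed, the weights were not downloaded; running `git lfs pull` inside the checkpoint repository usually fetches them.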

Step2: Clone the Amphion source code from GitHub

git clone https://github.com/open-mmlab/Amphion.git

Step3: Specify the checkpoint's path

Create a symbolic link to the checkpoint downloaded in the first step:

cd Amphion
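# Create the target directory first; ln fails if ckpts/tts does not exist.
mkdir -p ckpts/tts
# The relative path assumes both repositories were cloned into the same directory.
ln -s ../../../naturalspeech2_libritts ckpts/tts/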
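Because the link target is a relative path, it is resolved from the directory that holds the link, not from the current working directory, so a misplaced clone leaves a dangling link. The following is a minimal sketch for checking this; `check_ckpt_link` is a hypothetical helper name, not something provided by Amphion.

```shell
# Hypothetical helper (not part of Amphion): report whether a path is a
# symbolic link that resolves to an existing directory.
check_ckpt_link() {
    ckpt_link="$1"
    if [ ! -L "$ckpt_link" ]; then
        echo "not a symlink: $ckpt_link"
        return 1
    fi
    # -d follows the link, so it fails when the target is missing.
    if [ ! -d "$ckpt_link" ]; then
        echo "dangling symlink: $ckpt_link -> $(readlink "$ckpt_link")"
        return 1
    fi
    echo "ok: $ckpt_link -> $(readlink "$ckpt_link")"
}

# Usage (run inside the Amphion directory, without a trailing slash):
#   check_ckpt_link ckpts/tts/naturalspeech2_libritts
```

A "dangling symlink" message typically means the checkpoint repository is not located next to the Amphion directory, in which case the link can be recreated with an absolute path instead.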

Step4: Inference

You can follow the inference part of this recipe to generate speech from text.

We also provide an online demo; feel free to try it!