StyleTTS2 Fine-tuned Model
This model is a fine-tuned version of StyleTTS2 for text-to-speech, starting from the StyleTTS2-LibriTTS base model.
Model Details
- Base Model: StyleTTS2-LibriTTS
- Architecture: StyleTTS2
- Task: Text-to-Speech
- Last Checkpoint: epoch_2nd_00004.pth
Training Details
- Total Epochs: 5
- Completed Epochs: 4
- Total Iterations: 411
- Batch Size: 2
- Max Length: 120
- Learning Rate: 0.0001
- Final Validation Loss: 0.430844
Loss Parameters
- Diff Epoch: 10
- Joint Epoch: 110
- Lambda Parameters:
  - Mel: 5.0
  - F0: 1.0
  - Duration: 1.0
  - Style: 1.0
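To make the interaction between these thresholds and the 5-epoch run concrete, here is a minimal sketch. It assumes that training phases switch on once the zero-based epoch index reaches Diff Epoch or Joint Epoch, which is how the upstream StyleTTS2 training scripts appear to gate them. The helper names `active_phases` and `weighted_loss` are hypothetical, and the weighted sum covers only the four lambdas listed here, not the full training objective.

```python
# Values copied from this model card.
TOTAL_EPOCHS = 5
DIFF_EPOCH = 10
JOINT_EPOCH = 110
LAMBDAS = {"mel": 5.0, "f0": 1.0, "duration": 1.0, "style": 1.0}


def active_phases(epoch, diff_epoch=DIFF_EPOCH, joint_epoch=JOINT_EPOCH):
    """Return the phases active at a zero-based epoch.

    Assumption: a phase is enabled once `epoch >= threshold`.
    """
    phases = ["base"]
    if epoch >= diff_epoch:
        phases.append("style_diffusion")
    if epoch >= joint_epoch:
        phases.append("joint")
    return phases


def weighted_loss(terms, lambdas=LAMBDAS):
    """Weighted sum over the listed loss terms only (hypothetical helper)."""
    return sum(lambdas[name] * value for name, value in terms.items())


if __name__ == "__main__":
    for epoch in range(TOTAL_EPOCHS):
        print(epoch, active_phases(epoch))
```

Under that assumption, every epoch of this run reports only the base phase, because both thresholds lie beyond the 5 total epochs.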
Model Components
- bert
- bert_encoder
- predictor
- decoder
- text_encoder
- predictor_encoder
- style_encoder
- diffusion
- text_aligner
- pitch_extractor
- mpd
- msd
- wd
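A quick way to confirm that a checkpoint contains every component listed above is sketched below. It assumes the checkpoint is a dictionary whose `net` entry maps component names to state dicts, as the upstream StyleTTS2 training scripts appear to save it. If there is no `net` key, the top-level dictionary is checked instead. `missing_components` is a hypothetical helper, not part of StyleTTS2.

```python
# Component names copied from this model card.
EXPECTED_COMPONENTS = [
    "bert", "bert_encoder", "predictor", "decoder", "text_encoder",
    "predictor_encoder", "style_encoder", "diffusion", "text_aligner",
    "pitch_extractor", "mpd", "msd", "wd",
]


def missing_components(checkpoint, expected=EXPECTED_COMPONENTS):
    """List expected components that are absent from a loaded checkpoint.

    Assumption: weights live under checkpoint["net"]; otherwise the
    top-level dictionary is treated as the component mapping.
    """
    net = checkpoint.get("net", checkpoint)
    return [name for name in expected if name not in net]


if __name__ == "__main__":
    # Stand-in for a real checkpoint so the sketch runs without PyTorch.
    fake_checkpoint = {"net": {name: {} for name in EXPECTED_COMPONENTS}}
    print(missing_components(fake_checkpoint))
```

With PyTorch installed, the dictionary would typically come from `torch.load("epoch_2nd_00004.pth", map_location="cpu")`. Depending on the PyTorch version and what else is stored in the file, additional loading options may be needed.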
Training Metrics
A visualization of the training metrics is available in training_metrics.png.