StyleTTS2 Fine-tuned Model

This model is a fine-tuned version of StyleTTS2, trained from the StyleTTS2-LibriTTS base model.

Model Details

  • Base Model: StyleTTS2-LibriTTS
  • Architecture: StyleTTS2
  • Task: Text-to-Speech
  • Last Checkpoint: epoch_2nd_00004.pth
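
The checkpoint name can be split into a stage label and an index. The sketch below assumes the pattern `epoch_<stage>_<index>.pth` and that the index is zero-based, which would make `00004` the fifth and final epoch of a 5-epoch run. `parse_checkpoint_name` is a hypothetical helper written for this card, not part of StyleTTS2.

```python
import re

# Assumed filename pattern: "epoch_<stage>_<zero-padded index>.pth".
_CKPT_RE = re.compile(r"^epoch_(?P<stage>\w+?)_(?P<index>\d+)\.pth$")


def parse_checkpoint_name(name):
    """Return (stage, index) parsed from a checkpoint file name."""
    match = _CKPT_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized checkpoint name: {name!r}")
    return match.group("stage"), int(match.group("index"))


stage, index = parse_checkpoint_name("epoch_2nd_00004.pth")
print(stage, index)
```

A helper like this is mainly useful for sorting several checkpoints and picking the latest one.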

Training Details

  • Total Epochs: 5
  • Batch Size: 2
  • Max Length: 120
  • Learning Rate: 0.0001
  • Loss Parameters:
    • Diff Epoch: 10
    • Joint Epoch: 110
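
The settings above can be written down as a small configuration and checked for consistency. This is a minimal sketch: the key names are illustrative, and it assumes that Diff Epoch and Joint Epoch are thresholds compared against a zero-based epoch counter, so that a stage only becomes active once the counter reaches the threshold. Under that assumption, a 5-epoch run reaches neither stage.

```python
# Values copied from the Training Details list; key names are illustrative.
training = {
    "epochs": 5,
    "batch_size": 2,
    "max_len": 120,
    "learning_rate": 1e-4,
    "loss_params": {"diff_epoch": 10, "joint_epoch": 110},
}


def stages_reached(cfg):
    """Report which threshold-gated stages a run of cfg["epochs"] epochs reaches.

    Assumption: a stage is active once the zero-based epoch index is
    greater than or equal to its threshold.
    """
    last_epoch = cfg["epochs"] - 1
    return {name: last_epoch >= start
            for name, start in cfg["loss_params"].items()}


print(stages_reached(training))
```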

Model Components

  • bert
  • bert_encoder
  • predictor
  • decoder
  • text_encoder
  • predictor_encoder
  • style_encoder
  • diffusion
  • text_aligner
  • pitch_extractor
  • mpd
  • msd
  • wd
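
After downloading a checkpoint, it can help to confirm that every listed component is present. The sketch below works on a plain collection of names so it runs without PyTorch. Two things here are assumptions and not stated in this card: that the component names are the keys of a `net` dictionary inside the checkpoint, and that `mpd`, `msd`, and `wd` are discriminators needed only during training.

```python
# Component names copied from the Model Components list.
EXPECTED_COMPONENTS = [
    "bert", "bert_encoder", "predictor", "decoder", "text_encoder",
    "predictor_encoder", "style_encoder", "diffusion", "text_aligner",
    "pitch_extractor", "mpd", "msd", "wd",
]

# Assumption: these three are discriminators used only during training.
TRAINING_ONLY = {"mpd", "msd", "wd"}


def check_components(found_keys):
    """Return (missing, unexpected) component names, each in stable order."""
    found = set(found_keys)
    missing = [name for name in EXPECTED_COMPONENTS if name not in found]
    unexpected = sorted(found - set(EXPECTED_COMPONENTS))
    return missing, unexpected


def inference_components():
    """Components left after dropping the assumed training-only ones."""
    return [name for name in EXPECTED_COMPONENTS if name not in TRAINING_ONLY]


# Stand-in for the keys of a loaded checkpoint. With PyTorch installed they
# might come from torch.load(path, map_location="cpu")["net"].keys(),
# assuming that layout.
found = [name for name in EXPECTED_COMPONENTS if name != "wd"]
print(check_components(found))
print(inference_components())
```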