speecht5_finetuned_voxpopuli_de

This model is a fine-tuned version of microsoft/speecht5_tts on the voxpopuli dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4484
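
Since the card does not yet include a usage example, the following is a minimal inference sketch rather than the author's documented workflow. It uses the standard SpeechT5 classes from Transformers and assumes several things the card does not state: that the repository `SverreNystad/speecht5_finetuned_voxpopuli_de` ships its processor files (if not, loading the processor from `microsoft/speecht5_tts` is the usual fallback), that the stock `microsoft/speecht5_hifigan` vocoder is appropriate, and that the caller supplies a 512-dimensional x-vector speaker embedding. The function name `synthesize` and the constants are illustrative.

```python
# Repo id as shown on the model page; the vocoder id is an assumption.
MODEL_ID = "SverreNystad/speecht5_finetuned_voxpopuli_de"
VOCODER_ID = "microsoft/speecht5_hifigan"
SAMPLE_RATE = 16_000  # SpeechT5 with the HiFi-GAN vocoder produces 16 kHz audio


def synthesize(text, speaker_embedding, model_id=MODEL_ID, vocoder_id=VOCODER_ID):
    """Return a 1-D float waveform (NumPy array) for `text`.

    `speaker_embedding` is a 512-dimensional x-vector (list, array, or tensor).
    The card does not say which embeddings were used in training, so the
    choice of embedding is left to the caller (assumption).
    """
    # Imported lazily so this module can be imported without torch installed.
    import torch
    from transformers import (
        SpeechT5ForTextToSpeech,
        SpeechT5HifiGan,
        SpeechT5Processor,
    )

    processor = SpeechT5Processor.from_pretrained(model_id)
    model = SpeechT5ForTextToSpeech.from_pretrained(model_id)
    vocoder = SpeechT5HifiGan.from_pretrained(vocoder_id)

    # Tokenize the text and shape the speaker embedding as (1, 512).
    inputs = processor(text=text, return_tensors="pt")
    embedding = torch.as_tensor(speaker_embedding, dtype=torch.float32).reshape(1, -1)

    with torch.no_grad():
        speech = model.generate_speech(inputs["input_ids"], embedding, vocoder=vocoder)
    return speech.numpy()
```

Calling `synthesize("Guten Tag.", xvector)` downloads the weights on first use; the returned array can be written to disk at `SAMPLE_RATE` with any audio library.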

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 4
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 32
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • training_steps: 40000
  • mixed_precision_training: Native AMP
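
To make two of the values above concrete, the sketch below recomputes the total train batch size and the learning rate that a linear schedule with warmup yields at a given step. The numbers come from the list above; `NUM_DEVICES = 1` is an assumption (it is the device count for which 4 × 8 gives the listed 32), and the helper names are illustrative. The schedule formula mirrors the usual linear-warmup, linear-decay rule and is not taken from the training script itself.

```python
# Values copied from the hyperparameter list.
LEARNING_RATE = 1e-05
TRAIN_BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 8
WARMUP_STEPS = 500
TRAINING_STEPS = 40_000
NUM_DEVICES = 1  # assumption: not stated in the card


def effective_batch_size(per_device=TRAIN_BATCH_SIZE,
                         accum=GRAD_ACCUM_STEPS,
                         devices=NUM_DEVICES):
    """Examples consumed per optimizer step."""
    return per_device * accum * devices


def linear_lr(step, peak=LEARNING_RATE, warmup=WARMUP_STEPS, total=TRAINING_STEPS):
    """Learning rate at `step`: linear ramp to `peak`, then linear decay to 0."""
    if step < warmup:
        return peak * step / max(1, warmup)
    return peak * max(0.0, (total - step) / max(1, total - warmup))


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size())
    for step in (0, 250, 500, 20_250, 40_000):
        print(f"step {step:>6}: lr = {linear_lr(step):.2e}")
```

Under these assumptions the rate reaches its peak of 1e-05 at step 500 and falls back to half of that at step 20250, midway through the decay phase.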

Training results

Training Loss  Epoch    Step   Validation Loss
0.5233         2.2783   1000   0.4829
0.5036         4.5566   2000   0.4684
0.503          6.8349   3000   0.4616
0.4895         9.1118   4000   0.4577
0.486          11.3901  5000   0.4537
0.4835         13.6684  6000   0.4524
0.4757         15.9467  7000   0.4511
0.4771         18.2236  8000   0.4504
0.4745         20.5019  9000   0.4488
0.474          22.7802  10000  0.4479
0.4697         25.0570  11000  0.4493
0.4673         27.3353  12000  0.4485
0.4716         29.6136  13000  0.4481
0.4651         31.8919  14000  0.4482
0.4699         34.1688  15000  0.4471
0.4613         36.4471  16000  0.4481
0.4655         38.7254  17000  0.4478
0.4601         41.0023  18000  0.4468
0.4602         43.2806  19000  0.4454
0.4613         45.5589  20000  0.4469
0.4606         47.8372  21000  0.4467
0.4546         50.1141  22000  0.4479
0.4545         52.3924  23000  0.4465
0.4556         54.6707  24000  0.4470
0.4578         56.9490  25000  0.4466
0.4564         59.2258  26000  0.4466
0.4566         61.5041  27000  0.4480
0.457          63.7824  28000  0.4470
0.4531         66.0593  29000  0.4493
0.4521         68.3376  30000  0.4478
0.4527         70.6159  31000  0.4488
0.4513         72.8942  32000  0.4479
0.455          75.1711  33000  0.4478
0.4533         77.4494  34000  0.4486
0.4565         79.7277  35000  0.4473
0.452          82.0046  36000  0.4489
0.4523         84.2829  37000  0.4477
0.4523         86.5612  38000  0.4481
0.4536         88.8395  39000  0.4481
0.4512         91.1163  40000  0.4484
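
The validation loss flattens after roughly 10000 steps and then fluctuates between 0.4454 and 0.4493, while the loss reported at the top of the card (0.4484) matches the final evaluation at step 40000. The short script below, a sketch with illustrative names, scans the validation-loss column copied from the table to locate the lowest value and to estimate the number of optimizer steps per epoch from the first row.

```python
# Validation loss at steps 1000, 2000, ..., 40000, copied from the table.
VALIDATION_LOSS = [
    0.4829, 0.4684, 0.4616, 0.4577, 0.4537, 0.4524, 0.4511, 0.4504,
    0.4488, 0.4479, 0.4493, 0.4485, 0.4481, 0.4482, 0.4471, 0.4481,
    0.4478, 0.4468, 0.4454, 0.4469, 0.4467, 0.4479, 0.4465, 0.4470,
    0.4466, 0.4466, 0.4480, 0.4470, 0.4493, 0.4478, 0.4488, 0.4479,
    0.4478, 0.4486, 0.4473, 0.4489, 0.4477, 0.4481, 0.4481, 0.4484,
]
EVAL_EVERY = 1000  # evaluation interval in optimizer steps, read off the Step column


def best_checkpoint(losses, eval_every=EVAL_EVERY):
    """Return (step, loss) of the evaluation with the lowest validation loss."""
    best_index = min(range(len(losses)), key=losses.__getitem__)
    return (best_index + 1) * eval_every, losses[best_index]


best_step, best_loss = best_checkpoint(VALIDATION_LOSS)
final_loss = VALIDATION_LOSS[-1]

# First table row: step 1000 corresponds to epoch 2.2783.
steps_per_epoch = 1000 / 2.2783

if __name__ == "__main__":
    print(f"best:  step {best_step}, validation loss {best_loss}")  # step 19000, 0.4454
    print(f"final: step {len(VALIDATION_LOSS) * EVAL_EVERY}, validation loss {final_loss}")
    print(f"approx. optimizer steps per epoch: {steps_per_epoch:.1f}")
```

The gap between the best and final evaluations is about 0.003, so the difference may well be within evaluation noise; the card does not say whether intermediate checkpoints were kept.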

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu126
  • Datasets 3.6.0
  • Tokenizers 0.22.1