metadata

library_name: transformers
language:
  - bn
license: mit
base_model: microsoft/speecht5_tts
tags:
  - Bengali
  - generated_from_trainer
datasets:
  - ucalyptus/train-bn
model-index:
  - name: SpeechT5-tuned-bn
    results: []

SpeechT5-tuned-bn

This model is a fine-tuned version of microsoft/speecht5_tts on the ucalyptus/train-bn dataset. It achieves the following results on the evaluation set:

Loss: 0.5622

Model description

More information needed

Intended uses & limitations

More information needed
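
As a minimal sketch of how a SpeechT5 text-to-speech checkpoint such as this one is typically loaded for inference with transformers: the function name `synthesize` and its parameters are illustrative, not part of this model card. The repository ID of this checkpoint is not stated here, so it is left as a required argument. The sketch also assumes that the checkpoint repository ships a processor, that the standard `microsoft/speecht5_hifigan` vocoder is suitable, and that the caller supplies a 512-dimensional speaker x-vector of shape (1, 512).

```python
def synthesize(text, repo_id, speaker_embedding,
               vocoder_id="microsoft/speecht5_hifigan"):
    """Generate a waveform for `text` with a fine-tuned SpeechT5 checkpoint.

    repo_id           -- Hub ID or local path of the checkpoint (not given in
                         this card; the caller must supply it).
    speaker_embedding -- torch.Tensor of shape (1, 512), an x-vector chosen
                         by the caller (assumption: same format as the base model).
    """
    # Imported lazily so that defining the function does not require the
    # heavy dependencies to be installed.
    import torch
    from transformers import (
        SpeechT5ForTextToSpeech,
        SpeechT5HifiGan,
        SpeechT5Processor,
    )

    processor = SpeechT5Processor.from_pretrained(repo_id)
    model = SpeechT5ForTextToSpeech.from_pretrained(repo_id)
    vocoder = SpeechT5HifiGan.from_pretrained(vocoder_id)

    # Tokenize the input text into model input IDs.
    inputs = processor(text=text, return_tensors="pt")

    # Predict a spectrogram and convert it to audio with the vocoder.
    with torch.no_grad():
        speech = model.generate_speech(
            inputs["input_ids"], speaker_embedding, vocoder=vocoder
        )
    return speech  # 1-D waveform tensor; SpeechT5 operates at 16 kHz
```

The returned tensor can be written to disk with any audio library at a 16 kHz sample rate. Output quality for Bengali text depends on whether the tokenizer covers Bengali script, which this card does not document.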

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 4
eval_batch_size: 2
seed: 42
gradient_accumulation_steps: 8
total_train_batch_size: 32
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 100
num_epochs: 3.0
mixed_precision_training: Native AMP
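
The hyperparameters above can be sketched as keyword arguments for `transformers.Seq2SeqTrainingArguments`, together with the arithmetic behind the total batch size and the linear schedule. This is an illustration, not the original training script. The helper names `effective_batch_size` and `linear_lr` are hypothetical, the single-device setup is an assumption that happens to be consistent with 4 × 8 = 32, and "Native AMP" is assumed to mean fp16 rather than bf16.

```python
# Hyperparameters from the list above, keyed by the matching
# TrainingArguments field names.
training_kwargs = {
    "learning_rate": 1e-4,
    "per_device_train_batch_size": 4,
    "per_device_eval_batch_size": 2,
    "seed": 42,
    "gradient_accumulation_steps": 8,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "warmup_steps": 100,
    "num_train_epochs": 3.0,
    "fp16": True,  # assumption: "Native AMP" refers to fp16
}


def effective_batch_size(kwargs, num_devices=1):
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )


def linear_lr(step, total_steps, base_lr=1e-4, warmup_steps=100):
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Warmup: ramp from 0 to base_lr over warmup_steps.
        return base_lr * step / max(1, warmup_steps)
    # Decay: ramp from base_lr at the end of warmup down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


print(effective_batch_size(training_kwargs))  # 4 * 8 * 1 = 32
```

With transformers installed, the dictionary could be passed as `Seq2SeqTrainingArguments(output_dir=..., **training_kwargs)`, where `output_dir` is chosen by the user. The `total_steps` argument of `linear_lr` is not stated in this card and has to be supplied by the caller.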

Training results

Training Loss	Epoch	Step	Validation Loss
6.2372	0.1803	100	0.7144
5.5988	0.3607	200	0.6772
5.4093	0.5410	300	0.6321
5.3172	0.7214	400	0.6306
5.1628	0.9017	500	0.6069
5.1058	1.0821	600	0.6035
5.0202	1.2624	700	0.5955
5.0445	1.4427	800	0.5878
4.9277	1.6231	900	0.5814
4.9124	1.8034	1000	0.5767
4.8770	1.9838	1100	0.5764
4.8186	2.1641	1200	0.5672
4.7883	2.3445	1300	0.5692
4.7329	2.5248	1400	0.5635
4.8234	2.7051	1500	0.5598
4.7006	2.8855	1600	0.5622
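
The evaluation loss reported at the top of this card (0.5622) matches the final row of the table, at step 1600. A short sketch for reading the table programmatically follows; the variable names are illustrative and the rows are copied from the table above. It locates the checkpoint with the lowest validation loss and estimates the number of optimizer steps per epoch from the logged step and epoch columns.

```python
# (training_loss, epoch, step, validation_loss) rows from the table above.
rows = [
    (6.2372, 0.1803, 100, 0.7144),
    (5.5988, 0.3607, 200, 0.6772),
    (5.4093, 0.5410, 300, 0.6321),
    (5.3172, 0.7214, 400, 0.6306),
    (5.1628, 0.9017, 500, 0.6069),
    (5.1058, 1.0821, 600, 0.6035),
    (5.0202, 1.2624, 700, 0.5955),
    (5.0445, 1.4427, 800, 0.5878),
    (4.9277, 1.6231, 900, 0.5814),
    (4.9124, 1.8034, 1000, 0.5767),
    (4.8770, 1.9838, 1100, 0.5764),
    (4.8186, 2.1641, 1200, 0.5672),
    (4.7883, 2.3445, 1300, 0.5692),
    (4.7329, 2.5248, 1400, 0.5635),
    (4.8234, 2.7051, 1500, 0.5598),
    (4.7006, 2.8855, 1600, 0.5622),
]

# Row with the lowest validation loss, and the last logged row.
best = min(rows, key=lambda r: r[3])
final = rows[-1]

# Rough estimate of optimizer steps per epoch from the last row.
steps_per_epoch = final[2] / final[1]

print(f"best:  step {best[2]}, validation loss {best[3]}")
print(f"final: step {final[2]}, validation loss {final[3]}")
print(f"approx. optimizer steps per epoch: {steps_per_epoch:.1f}")
```

The lowest validation loss in the table is 0.5598 at step 1500, slightly below the final value. Whether the published weights correspond to the last step or to an earlier checkpoint is not stated in this card.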

Framework versions

Transformers 4.46.0.dev0
Pytorch 2.4.1+cu121
Datasets 3.0.1
Tokenizers 0.20.1