metadata

base_model: openai/whisper-medium
datasets:
  - facebook/voxpopuli
language:
  - it
library_name: peft
license: apache-2.0
metrics:
  - wer
tags:
  - generated_from_trainer
model-index:
  - name: Whisper Medium
    results:
      - task:
          type: automatic-speech-recognition
          name: Automatic Speech Recognition
        dataset:
          name: facebook/voxpopuli
          type: facebook/voxpopuli
          config: default
          split: None
          args: default
        metrics:
          - type: wer
            value: 10.9375
            name: Wer

Whisper Medium

This model is a version of openai/whisper-medium fine-tuned with PEFT on the facebook/voxpopuli dataset for Italian automatic speech recognition. It achieves the following results on the evaluation set:

Loss: 0.4874
WER: 10.9375
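
For readers unfamiliar with the metric, word error rate is the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The sketch below is a minimal pure-Python version that reports the value in percent, on the assumption that the figures in this card use a percentage scale. The card does not state which metric implementation or text normalization was used, so `word_error_rate` is an illustrative helper, not the evaluation code behind the reported number.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: word-level edit distance / reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,         # insertion: extra word in hypothesis
                prev[j - 1] + (r != h),  # substitution, or a match at no cost
            ))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of four reference words.
    print(word_error_rate("il gatto dorme qui", "il cane dorme qui"))
```

Whitespace splitting is the simplest possible tokenization; casing and punctuation handling can shift the result noticeably, so scores are only comparable when the same normalization is applied.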

Model description

More information needed

Intended uses & limitations

More information needed
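
A minimal sketch of how an adapter like this one might be loaded for Italian transcription, assuming the weights are published as a standard PEFT adapter on top of openai/whisper-medium. `ADAPTER_ID` is a hypothetical placeholder because this card does not give the repository id, and `load_adapter` and `transcribe` are illustrative helper names. The heavy libraries are imported inside the functions so the snippet can be read and imported without them installed.

```python
BASE_MODEL = "openai/whisper-medium"        # taken from this card's metadata
ADAPTER_ID = "your-username/your-adapter"   # hypothetical placeholder, not a real repo id


def load_adapter(adapter_id: str = ADAPTER_ID, base_model: str = BASE_MODEL):
    """Load the base Whisper model and attach the PEFT adapter weights."""
    # Lazy imports: peft and transformers are only needed when this is called.
    from peft import PeftModel
    from transformers import WhisperForConditionalGeneration, WhisperProcessor

    processor = WhisperProcessor.from_pretrained(base_model)
    base = WhisperForConditionalGeneration.from_pretrained(base_model)
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()
    return processor, model


def transcribe(processor, model, audio, sampling_rate: int = 16_000) -> str:
    """Transcribe a mono waveform (1-D float array); Whisper expects 16 kHz audio."""
    import torch

    inputs = processor(audio, sampling_rate=sampling_rate, return_tensors="pt")
    with torch.no_grad():
        ids = model.generate(
            input_features=inputs.input_features,
            language="it",      # the card lists Italian as the language
            task="transcribe",
        )
    return processor.batch_decode(ids, skip_special_tokens=True)[0]
```

Calling `processor, model = load_adapter("<actual adapter repo id>")` followed by `transcribe(processor, model, waveform)` would return the decoded text. Audio at other sampling rates needs resampling to 16 kHz first.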

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 32
eval_batch_size: 16
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 64
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 100
training_steps: 1200
mixed_precision_training: Native AMP
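
The values above are related by simple arithmetic, which the following sketch makes explicit. It assumes a single training device (the card does not state the device count) and models the linear scheduler as linear warmup from zero followed by linear decay to zero at the final step. `linear_lr` is an illustrative helper, not the trainer's own code.

```python
LEARNING_RATE = 1e-4
WARMUP_STEPS = 100
TRAINING_STEPS = 1200
TRAIN_BATCH_SIZE = 32
GRAD_ACCUM_STEPS = 2
NUM_DEVICES = 1  # assumption: the card does not state the number of devices

# Samples contributing to each optimizer step.
total_train_batch_size = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        scale = step / WARMUP_STEPS
    else:
        scale = max(0.0, (TRAINING_STEPS - step) / (TRAINING_STEPS - WARMUP_STEPS))
    return LEARNING_RATE * scale


if __name__ == "__main__":
    print("total_train_batch_size:", total_train_batch_size)
    for step in (0, 50, 100, 650, 1200):
        print(step, linear_lr(step))
```

Under these assumptions the peak rate of 0.0001 is reached at step 100 and the rate falls back to zero at step 1200.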

Training results

Training Loss	Epoch	Step	Validation Loss	WER
2.2174	0.5714	100	1.9102	49.4792
0.2353	1.1429	200	0.3485	30.7292
0.1668	1.7143	300	0.7634	21.875
0.118	2.2857	400	0.6914	11.9792
0.0931	2.8571	500	0.5523	15.1042
0.0851	3.4286	600	0.6818	13.0208
0.0751	4.0	700	0.6348	11.9792
0.066	4.5714	800	0.6576	11.9792
0.0604	5.1429	900	0.4125	10.9375
0.0564	5.7143	1000	0.6815	10.9375
0.0499	6.2857	1100	0.4861	11.4583
0.0472	6.8571	1200	0.4874	10.9375
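
As a reading aid, the sketch below loads the rows of the table above and picks out a few checkpoints of interest. The numbers are copied from the table; the field names and the tie-breaking rule (lowest WER first, then lowest validation loss) are choices made for this illustration, and the card does not say how the released checkpoint was selected.

```python
# (training_loss, epoch, step, validation_loss, wer), copied from the results table.
ROWS = [
    (2.2174, 0.5714, 100, 1.9102, 49.4792),
    (0.2353, 1.1429, 200, 0.3485, 30.7292),
    (0.1668, 1.7143, 300, 0.7634, 21.875),
    (0.118, 2.2857, 400, 0.6914, 11.9792),
    (0.0931, 2.8571, 500, 0.5523, 15.1042),
    (0.0851, 3.4286, 600, 0.6818, 13.0208),
    (0.0751, 4.0, 700, 0.6348, 11.9792),
    (0.066, 4.5714, 800, 0.6576, 11.9792),
    (0.0604, 5.1429, 900, 0.4125, 10.9375),
    (0.0564, 5.7143, 1000, 0.6815, 10.9375),
    (0.0499, 6.2857, 1100, 0.4861, 11.4583),
    (0.0472, 6.8571, 1200, 0.4874, 10.9375),
]
FIELDS = ("training_loss", "epoch", "step", "validation_loss", "wer")
records = [dict(zip(FIELDS, row)) for row in ROWS]

# Checkpoint with the lowest validation loss overall.
lowest_loss = min(records, key=lambda r: r["validation_loss"])

# Checkpoints tied for the lowest WER, then the one with the lowest loss among them.
best_wer = min(r["wer"] for r in records)
tied = [r for r in records if r["wer"] == best_wer]
best = min(tied, key=lambda r: r["validation_loss"])

# Optimizer steps per epoch implied by the first row (step / epoch).
steps_per_epoch = round(records[0]["step"] / records[0]["epoch"])

if __name__ == "__main__":
    print("lowest validation loss at step", lowest_loss["step"])
    print("steps tied on lowest WER:", [r["step"] for r in tied])
    print("lowest-loss step among those:", best["step"])
    print("implied steps per epoch:", steps_per_epoch)
```

The lowest validation loss (step 200) and the lowest WER (steps 900, 1000 and 1200) fall on different checkpoints, and validation loss fluctuates between evaluations, so the two metrics are worth reading together. The headline figures reported earlier in the card match the final row at step 1200.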

Framework versions

PEFT 0.12.0
Transformers 4.43.1
Pytorch 2.4.1+cu121
Datasets 3.0.0
Tokenizers 0.19.1