metadata

license: apache-2.0
language:
  - de
library_name: transformers
pipeline_tag: automatic-speech-recognition

whisper-tiny-german

This is a German speech recognition model based on the whisper-tiny model. It has 37.8M parameters, and its weights take up 73MB in bfloat16 format.
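The relationship between parameter count and weight size can be checked with simple arithmetic. The sketch below is only a back-of-the-envelope estimate: the helper name `weight_size_mib` is made up for illustration, and it assumes every parameter is stored in 2 bytes (bfloat16) with no file overhead. Because the 37.8M figure is rounded and checkpoint files carry some metadata, the estimate lands close to, but not exactly on, the stated 73MB.

```python
# Bytes needed to store one parameter in each format.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2}


def weight_size_mib(num_params: float, dtype: str = "bfloat16") -> float:
    """Approximate weight size in MiB (1 MiB = 1024 * 1024 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / (1024 ** 2)


size = weight_size_mib(37.8e6)
print(f"{size:.1f} MiB")  # prints "72.1 MiB"
```

The same helper shows that storing the weights in float32 would roughly double the footprint.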

As a follow-up to Whisper large v3 german, we created a tiny version for edge cases where model size is a concern.

Intended uses & limitations

The model is intended for German speech recognition tasks, in particular for edge cases where model size is a concern. It is not recommended for critical use cases: as a tiny model, it may not perform well in all scenarios.
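A minimal usage sketch with the Transformers `pipeline` API could look like the following. The repository id in `MODEL_ID` is a hypothetical placeholder, not the real location of this model, and the helper name `transcribe` is chosen only for illustration. The sketch assumes the default pipeline settings are acceptable; it does not reflect any particular recommended configuration.

```python
# Hypothetical placeholder: replace with the actual Hub repository id of this model.
MODEL_ID = "your-namespace/whisper-tiny-german"


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Transcribe a German audio file and return the recognized text."""
    # Imported lazily so that defining this helper does not require transformers.
    from transformers import pipeline

    # Build an automatic-speech-recognition pipeline around the checkpoint.
    asr = pipeline("automatic-speech-recognition", model=model_id)
    result = asr(audio_path)
    return result["text"]
```

Calling `transcribe("sample.wav")` with a local audio file (the file name is again hypothetical) would download the checkpoint on first use. Decoding audio from a file path requires ffmpeg to be installed.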

Dataset

The training data is a filtered subset of the Common Voice dataset, Multilingual LibriSpeech, and some internal data. The data was filtered and double-checked for quality and correctness. We also normalized the transcripts, especially with respect to casing and punctuation.
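The exact normalization rules are not documented here, so the following is only an illustration of what casing and punctuation normalization can look like. It assumes one common convention (lowercasing and stripping punctuation); the actual preprocessing may instead have harmonized casing and punctuation in some other way. The function name `normalize_transcript` is hypothetical.

```python
import re
import unicodedata


def normalize_transcript(text: str) -> str:
    """Lowercase a transcript and remove punctuation (illustrative rules only)."""
    # Compose characters so that umlauts such as "ü" are single code points.
    text = unicodedata.normalize("NFC", text)
    text = text.lower()
    # Replace anything that is not a word character or whitespace with a space.
    # \w is Unicode-aware for str patterns, so ä, ö, ü and ß are preserved.
    text = re.sub(r"[^\w\s]", " ", text)
    # Collapse repeated whitespace and trim the ends.
    return re.sub(r"\s+", " ", text).strip()


print(normalize_transcript("Guten Morgen, Frau Müller!"))  # prints "guten morgen frau müller"
```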

Model family

| Model | Parameters | Link |
|--- |--- |--- |
| Whisper large v3 german | 1.54B | link |
| Distil-whisper large v3 german | 756M | |
| tiny whisper | 37.8M | |

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 3e-05
total_train_batch_size: 512
num_epochs: 5.0
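In the Transformers `Trainer`, the total train batch size is the product of the per-device batch size, the number of devices, and the number of gradient accumulation steps. The sketch below shows that relationship and a rough count of optimizer steps. The per-device batch size, device count, accumulation steps, and dataset size used here are hypothetical values chosen only to reproduce the total of 512; the actual training setup is not stated.

```python
import math


def total_train_batch_size(per_device: int, num_devices: int, grad_accum_steps: int) -> int:
    """Number of examples contributing to a single optimizer step."""
    return per_device * num_devices * grad_accum_steps


def approx_optimizer_steps(num_examples: int, batch_size: int, num_epochs: float) -> int:
    """Rough number of optimizer steps, ignoring dataloader details."""
    return math.ceil(num_examples / batch_size) * int(num_epochs)


# Hypothetical setup: 8 devices, 32 examples per device, 2 accumulation steps.
batch = total_train_batch_size(per_device=32, num_devices=8, grad_accum_steps=2)
print(batch)  # prints 512

# Hypothetical dataset of 1,000,000 examples trained for 5 epochs.
print(approx_optimizer_steps(1_000_000, batch, 5.0))  # prints 9770
```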

Framework versions

Transformers 4.39.3
Pytorch 2.3.0a0+ebedce2
Datasets 2.18.0
Tokenizers 0.15.2