metadata
language:
- 'no'
license: apache-2.0
tags:
- whisper-event
- norwegian
datasets:
- NbAiLab/NCC_S
- NbAiLab/NPSC
- NbAiLab/NST
- google/fleurs
metrics:
- wer
model-index:
- name: Whisper Tiny Norwegian Bokmål
  results:
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: FLEURS
      type: google/fleurs
      config: nb_no
      split: test
      args: nb_no
    metrics:
    - name: Wer
      type: wer
      value: 47.08
Whisper Tiny Norwegian Bokmål
This model is a fine-tuned version of openai/whisper-tiny trained on several datasets.
It is currently in the middle of a long training run. The current checkpoint achieves the following results on the evaluation set:
- Loss: 1.464
- Wer: 47.08
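For readers unfamiliar with the metric, the following is a minimal sketch of how a word error rate (WER) can be computed: the word-level edit distance between reference and hypothesis, divided by the number of reference words and expressed in percent. The function name `word_error_rate` and the example sentences are illustrative assumptions; this is not the evaluation script behind the figure above, which may also apply text normalization and aggregate over the whole test set.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent for one reference/hypothesis pair.

    Hypothetical helper for illustration; splits on whitespace only
    and applies no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# One word dropped out of four reference words.
example_wer = word_error_rate("jeg bor i oslo", "jeg bor oslo")
```

Read this way, a WER of 47.08 would mean roughly 47 word errors per 100 reference words, assuming the value is reported in percent.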
Model description
The model is trained on a corpus of roughly 5,000 hours of speech. The sources are subtitles from the Norwegian broadcaster NRK, transcribed speeches from the Norwegian parliament, and voice recordings from Norsk Språkteknologi.
Intended uses & limitations
The model will be free for everyone to use when it is finished.
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 3e-06
- train_batch_size: 128
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- training_steps: 100,000 (currently 4,000)
- mixed_precision_training: fp16
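
To make the scheduler settings above concrete, here is a small sketch of a linear schedule with warmup: the learning rate rises linearly from zero over the warmup steps, then decays linearly to zero at the final training step. The function name `linear_schedule_lr` is an assumption for illustration, and the formula is the common linear warmup/decay rule, not code taken from this model's training script.

```python
def linear_schedule_lr(
    step: int,
    base_lr: float = 3e-06,      # learning_rate from the list above
    warmup_steps: int = 500,     # lr_scheduler_warmup_steps
    total_steps: int = 100_000,  # training_steps
) -> float:
    """Return the learning rate at a given optimizer step (illustrative)."""
    if step < warmup_steps:
        # Linear warmup from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the checkpoint mentioned above (step 4,000).
lr_at_current_step = linear_schedule_lr(4_000)
```

Under these assumptions, the learning rate at step 4,000 is about 2.89e-06, so the run is still close to its peak rate.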