metadata
license: apache-2.0
tags:
- generated_from_trainer
datasets:
- mozilla-foundation/common_voice_12_0
metrics:
- wer
model-index:
- name: wav2vec2-large-xls-r-1b-frisian
  results:
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: common_voice_12_0
      type: common_voice_12_0
      config: fy-NL
      split: test
      args: fy-NL
    metrics:
    - name: Wer
      type: wer
      value: 0.15990775235054105
language:
- fy
wav2vec2-large-xls-r-1b-frisian
This model is a fine-tuned version of facebook/wav2vec2-xls-r-1b on the common_voice_12_0 dataset. It achieves the following results on the evaluation set:
- Loss: 0.2634
- WER: 0.1599
This model was developed together with golesheed for the course "Speech Recognition II" of the "MSc Voice Technology" program at Rijksuniversiteit Groningen - Campus Fryslân.
Intended uses & limitations
The model is intended for automatic speech recognition of Frisian.
Known limitations: hyperparameters were only lightly tuned, no language model (LM) rescoring is applied to the output, and the model was trained on v12 of Common Voice rather than the newer v13.
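A minimal inference sketch, assuming the checkpoint is published on the Hugging Face Hub and that `transformers` plus a PyTorch backend are installed. The Hub namespace is not stated in this card, so `MODEL_ID` is a hypothetical placeholder, and `transcribe` is an illustrative helper name rather than part of any library.

```python
# Hypothetical placeholder: the Hub namespace is not given in this card.
MODEL_ID = "<namespace>/wav2vec2-large-xls-r-1b-frisian"


def transcribe(audio_path, model_id=MODEL_ID):
    """Transcribe one audio file with the fine-tuned checkpoint."""
    # Imported lazily so this module loads even without transformers installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    # The pipeline returns a dict; the transcript is stored under "text".
    return asr(audio_path)["text"]
```

Calling `transcribe("sample.wav")` (an illustrative path) would download the checkpoint on first use and return the decoded text. Because no LM rescoring is applied, the output is the plain CTC decoding of the acoustic model.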
Training and evaluation data
The training and evaluation splits are the ones provided with the Common Voice dataset (fy-NL configuration).
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 8e-05
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 50
- mixed_precision_training: Native AMP
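The list above can be sketched as keyword arguments for `transformers.TrainingArguments`. This is a reconstruction under assumptions, not the original training script: "Native AMP" is assumed to correspond to `fp16=True`, a single training device is assumed (16 × 2 = 32 matches the reported total batch size), and the helper names are hypothetical. The `linear_lr` function illustrates the shape of a linear schedule with warmup; `total_steps` is a parameter because the card does not state it.

```python
# Values copied from the hyperparameter list in this card.
# The Adam betas and epsilon listed there equal the TrainingArguments defaults,
# so they are not repeated here.
TRAINING_KWARGS = {
    "learning_rate": 8e-5,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "gradient_accumulation_steps": 2,
    "lr_scheduler_type": "linear",
    "warmup_steps": 500,
    "num_train_epochs": 50,
    "fp16": True,  # assumption: "Native AMP" mixed precision; needs a CUDA device
}


def effective_batch_size(kwargs, num_devices=1):
    """Samples consumed per optimizer step (num_devices=1 is an assumption)."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )


def linear_lr(step, total_steps, peak=8e-5, warmup=500):
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup:
        return peak * step / warmup
    return peak * max(0.0, (total_steps - step) / (total_steps - warmup))


def build_training_arguments(output_dir):
    """Build TrainingArguments lazily; output_dir is chosen by the caller."""
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)
```

With these values, `effective_batch_size(TRAINING_KWARGS)` gives 32, and the learning rate reaches its peak of 8e-05 at step 500 before decaying linearly.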
Training results
Training Loss | Epoch | Step | Validation Loss | Wer |
---|---|---|---|---|
4.7284 | 2.1 | 250 | 2.9453 | 1.0 |
1.7496 | 4.2 | 500 | 0.5141 | 0.4771 |
0.8168 | 6.3 | 750 | 0.3220 | 0.3148 |
0.7403 | 8.4 | 1000 | 0.2988 | 0.2573 |
0.7298 | 10.5 | 1250 | 0.2794 | 0.2347 |
0.6303 | 12.61 | 1500 | 0.2577 | 0.2164 |
0.5201 | 14.71 | 1750 | 0.2746 | 0.2162 |
0.5189 | 16.81 | 2000 | 0.2543 | 0.2034 |
0.5054 | 18.91 | 2250 | 0.2847 | 0.2071 |
0.5112 | 21.01 | 2500 | 0.2772 | 0.1979 |
0.5105 | 23.11 | 2750 | 0.2633 | 0.1920 |
0.5032 | 25.21 | 3000 | 0.2667 | 0.1856 |
0.46 | 27.31 | 3250 | 0.2730 | 0.1852 |
0.4992 | 29.41 | 3500 | 0.2626 | 0.1782 |
0.4535 | 31.51 | 3750 | 0.2778 | 0.1749 |
0.4036 | 33.61 | 4000 | 0.2825 | 0.1747 |
0.3347 | 35.71 | 4250 | 0.2797 | 0.1708 |
0.2708 | 37.82 | 4500 | 0.2662 | 0.1712 |
0.1825 | 39.92 | 4750 | 0.2652 | 0.1648 |
0.1654 | 42.02 | 5000 | 0.2719 | 0.1628 |
0.1387 | 44.12 | 5250 | 0.2552 | 0.1607 |
0.1367 | 46.22 | 5500 | 0.2641 | 0.1591 |
0.1218 | 48.32 | 5750 | 0.2634 | 0.1598 |
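The Wer column reports word error rate as a fraction, so 0.1598 is roughly 16 word errors per 100 reference words. As an illustration of the metric only (not the evaluation code used for this model), the sketch below computes WER for a single utterance as the word-level edit distance divided by the number of reference words. The function name and the example sentences are hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words to match an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the"): 2 errors / 6 words.
example = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

A corpus-level score is usually obtained by summing the errors over all utterances and dividing by the total number of reference words, rather than averaging per-utterance values.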
Framework versions
- Transformers 4.27.3
- Pytorch 2.0.0+cu117
- Datasets 2.10.1
- Tokenizers 0.13.2