wav2vec2-large-xls-r-300m-frisian-cv-8

This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m on the common_voice_8_0 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0707
  • Wer: 0.0724

And on the test set:

  • Wer: 0.0710
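
Wer here is the word error rate. As an illustration of the metric only (not the thesis evaluation script), it can be computed with the evaluate library:

```python
# Minimal sketch of computing WER with the Hugging Face `evaluate` library.
# The strings below are placeholders, not data from this model's test set.
import evaluate

wer_metric = evaluate.load("wer")

references = ["dit is in foarbyld"]   # placeholder ground-truth transcript
predictions = ["dit is in foarbyld"]  # placeholder model transcript

# WER = (substitutions + insertions + deletions) / number of reference words
print(wer_metric.compute(references=references, predictions=predictions))  # 0.0
```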

Model description

This model was developed for my Master's thesis in "Voice Technology" at Rijksuniversiteit Groningen - Campus Fryslân. It corresponds to experiment 6, in which the training set consists of all validated data (~50 hours) minus the test and evaluation sets (~4.5 hours each), leaving about 41 hours of Frisian speech for training. It differs from experiment 2 in that I fine-tune the 300M (0.3B) parameter version of XLS-R.

Intended uses & limitations

The intended use is for recognizing Frisian speech.
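
A minimal transcription sketch with transformers follows; the Hub repository id and the audio path are placeholders to replace with the actual values:

```python
import torch
import torchaudio
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

# Placeholder Hub id; substitute the actual repository path of this model.
MODEL_ID = "<user>/wav2vec2-large-xls-r-300m-frisian-cv-8"

processor = Wav2Vec2Processor.from_pretrained(MODEL_ID)
model = Wav2Vec2ForCTC.from_pretrained(MODEL_ID)

# Load a (mono) recording and resample to the 16 kHz the model expects.
waveform, sample_rate = torchaudio.load("audio.wav")  # placeholder file
waveform = torchaudio.functional.resample(waveform, sample_rate, 16_000)

inputs = processor(waveform.squeeze(0).numpy(), sampling_rate=16_000,
                   return_tensors="pt")
with torch.no_grad():
    logits = model(inputs.input_values).logits

# Greedy CTC decoding; no LM rescoring (see the limitations below).
predicted_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(predicted_ids)[0])
```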

Limitations include the absence of language model (LM) rescoring during decoding and the use of Common Voice version 8.0 rather than the more recent 13.0.
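
For context on the first limitation: decoding in the sketch above is plain greedy CTC. One way LM rescoring could be added is with pyctcdecode and Wav2Vec2ProcessorWithLM, assuming you supply your own Frisian KenLM n-gram; the .arpa path and Hub id below are hypothetical, and no LM ships with this model:

```python
from pyctcdecode import build_ctcdecoder
from transformers import Wav2Vec2Processor, Wav2Vec2ProcessorWithLM

# Hypothetical setup: "frisian_5gram.arpa" is a placeholder for a KenLM
# n-gram model you would have to train or obtain yourself.
processor = Wav2Vec2Processor.from_pretrained(
    "<user>/wav2vec2-large-xls-r-300m-frisian-cv-8"  # placeholder Hub id
)

# pyctcdecode needs the CTC labels sorted by their vocabulary index.
vocab = processor.tokenizer.get_vocab()
labels = [tok for tok, _ in sorted(vocab.items(), key=lambda kv: kv[1])]

decoder = build_ctcdecoder(labels, kenlm_model_path="frisian_5gram.arpa")
processor_with_lm = Wav2Vec2ProcessorWithLM(
    feature_extractor=processor.feature_extractor,
    tokenizer=processor.tokenizer,
    decoder=decoder,
)
# processor_with_lm.batch_decode(logits.numpy()).text then yields
# LM-rescored transcripts instead of the greedy argmax result.
```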

Training and evaluation data

The evaluation split used is the one available in the Common Voice 8.0 Frisian subset. The train split corresponds to all of the validated data except for the recordings found in the evaluation and test splits.
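
A sketch of how such a train split can be assembled with the datasets library (illustrative only; the scripts in the repository linked below are authoritative):

```python
from datasets import load_dataset

# Frisian ("fy-NL") subset of Common Voice 8.0; accessing it requires
# accepting the dataset's terms on the Hugging Face Hub.
DATASET, CONFIG = "mozilla-foundation/common_voice_8_0", "fy-NL"

validated = load_dataset(DATASET, CONFIG, split="validated")
dev = load_dataset(DATASET, CONFIG, split="validation")
test = load_dataset(DATASET, CONFIG, split="test")

# Keep every validated recording that is not in the dev or test splits,
# keyed on the audio file path.
held_out = set(dev["path"]) | set(test["path"])
train = validated.filter(lambda example: example["path"] not in held_out)
```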

Training procedure

The script used for training this model can be found in this GitHub repository: link.

Training hyperparameters

The following hyperparameters were used during training; a sketch of the corresponding TrainingArguments follows the list:

  • learning_rate: 5e-05
  • train_batch_size: 32
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.98) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 40
  • mixed_precision_training: Native AMP
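
As mentioned above the list, here is a sketch of TrainingArguments matching those values; the output directory is a placeholder, and the actual training script is in the repository linked above:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="wav2vec2-large-xls-r-300m-frisian-cv-8",  # placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.98,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=40,
    fp16=True,  # "Native AMP" mixed-precision training
)
```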

Training results

| Training Loss | Epoch | Step | Validation Loss | Wer |
|---|---|---|---|---|
| 14.7268 | 0.43 | 400 | 8.7389 | 1.0 |
| 5.3377 | 0.86 | 800 | 3.7016 | 1.0 |
| 3.343 | 1.29 | 1200 | 3.0984 | 1.0 |
| 3.0306 | 1.71 | 1600 | 2.9643 | 1.0 |
| 2.9511 | 2.14 | 2000 | 2.9273 | 1.0 |
| 2.9078 | 2.57 | 2400 | 2.8202 | 1.0 |
| 2.4965 | 3.0 | 2800 | 1.3805 | 0.8888 |
| 1.5378 | 3.43 | 3200 | 0.6556 | 0.5720 |
| 1.119 | 3.86 | 3600 | 0.4260 | 0.4077 |
| 0.9159 | 4.29 | 4000 | 0.3457 | 0.3322 |
| 0.8037 | 4.72 | 4400 | 0.2765 | 0.2850 |
| 0.7411 | 5.14 | 4800 | 0.2447 | 0.2473 |
| 0.6767 | 5.57 | 5200 | 0.2176 | 0.2234 |
| 0.6296 | 6.0 | 5600 | 0.1996 | 0.2078 |
| 0.6165 | 6.43 | 6000 | 0.1891 | 0.1977 |
| 0.5856 | 6.86 | 6400 | 0.1763 | 0.1855 |
| 0.5674 | 7.29 | 6800 | 0.1708 | 0.1797 |
| 0.5399 | 7.72 | 7200 | 0.1593 | 0.1694 |
| 0.5195 | 8.15 | 7600 | 0.1551 | 0.1660 |
| 0.4973 | 8.57 | 8000 | 0.1509 | 0.1583 |
| 0.4907 | 9.0 | 8400 | 0.1480 | 0.1525 |
| 0.4681 | 9.43 | 8800 | 0.1389 | 0.1494 |
| 0.4513 | 9.86 | 9200 | 0.1368 | 0.1414 |
| 0.4486 | 10.29 | 9600 | 0.1294 | 0.1390 |
| 0.4381 | 10.72 | 10000 | 0.1262 | 0.1354 |
| 0.443 | 11.15 | 10400 | 0.1234 | 0.1313 |
| 0.4182 | 11.58 | 10800 | 0.1196 | 0.1294 |
| 0.4036 | 12.0 | 11200 | 0.1194 | 0.1259 |
| 0.4027 | 12.43 | 11600 | 0.1170 | 0.1226 |
| 0.4066 | 12.86 | 12000 | 0.1156 | 0.1224 |
| 0.3885 | 13.29 | 12400 | 0.1136 | 0.1174 |
| 0.3859 | 13.72 | 12800 | 0.1121 | 0.1146 |
| 0.3812 | 14.15 | 13200 | 0.1097 | 0.1141 |
| 0.3774 | 14.58 | 13600 | 0.1059 | 0.1130 |
| 0.3678 | 15.01 | 14000 | 0.1058 | 0.1096 |
| 0.3586 | 15.43 | 14400 | 0.1026 | 0.1099 |
| 0.3612 | 15.86 | 14800 | 0.1010 | 0.1076 |
| 0.3626 | 16.29 | 15200 | 0.0993 | 0.1068 |
| 0.353 | 16.72 | 15600 | 0.0974 | 0.1046 |
| 0.3564 | 17.15 | 16000 | 0.0986 | 0.1037 |
| 0.3447 | 17.58 | 16400 | 0.0977 | 0.1041 |
| 0.3454 | 18.01 | 16800 | 0.0945 | 0.1023 |
| 0.3338 | 18.44 | 17200 | 0.0904 | 0.0996 |
| 0.3359 | 18.86 | 17600 | 0.0950 | 0.1002 |
| 0.3179 | 19.29 | 18000 | 0.0911 | 0.0977 |
| 0.3202 | 19.72 | 18400 | 0.0906 | 0.0979 |
| 0.3317 | 20.15 | 18800 | 0.0894 | 0.0963 |
| 0.3187 | 20.58 | 19200 | 0.0878 | 0.0938 |
| 0.3075 | 21.01 | 19600 | 0.0893 | 0.0937 |
| 0.3032 | 21.44 | 20000 | 0.0872 | 0.0923 |
| 0.3048 | 21.86 | 20400 | 0.0848 | 0.0921 |
| 0.3045 | 22.29 | 20800 | 0.0860 | 0.0887 |
| 0.316 | 22.72 | 21200 | 0.0841 | 0.0896 |
| 0.2986 | 23.15 | 21600 | 0.0840 | 0.0876 |
| 0.294 | 23.58 | 22000 | 0.0824 | 0.0862 |
| 0.313 | 24.01 | 22400 | 0.0814 | 0.0855 |
| 0.2864 | 24.44 | 22800 | 0.0816 | 0.0861 |
| 0.2927 | 24.87 | 23200 | 0.0807 | 0.0875 |
| 0.294 | 25.29 | 23600 | 0.0829 | 0.0826 |
| 0.2834 | 25.72 | 24000 | 0.0794 | 0.0823 |
| 0.2852 | 26.15 | 24400 | 0.0781 | 0.0815 |
| 0.2823 | 26.58 | 24800 | 0.0781 | 0.0821 |
| 0.2835 | 27.01 | 25200 | 0.0788 | 0.0826 |
| 0.2763 | 27.44 | 25600 | 0.0789 | 0.0823 |
| 0.2845 | 27.87 | 26000 | 0.0767 | 0.0803 |
| 0.2777 | 28.3 | 26400 | 0.0775 | 0.0809 |
| 0.275 | 28.72 | 26800 | 0.0758 | 0.0794 |
| 0.2707 | 29.15 | 27200 | 0.0745 | 0.0790 |
| 0.2734 | 29.58 | 27600 | 0.0765 | 0.0797 |
| 0.2716 | 30.01 | 28000 | 0.0746 | 0.0780 |
| 0.2626 | 30.44 | 28400 | 0.0756 | 0.0776 |
| 0.2671 | 30.87 | 28800 | 0.0742 | 0.0763 |
| 0.2592 | 31.3 | 29200 | 0.0730 | 0.0771 |
| 0.2685 | 31.73 | 29600 | 0.0733 | 0.0760 |
| 0.2727 | 32.15 | 30000 | 0.0738 | 0.0758 |
| 0.2564 | 32.58 | 30400 | 0.0731 | 0.0763 |
| 0.2528 | 33.01 | 30800 | 0.0730 | 0.0758 |
| 0.2573 | 33.44 | 31200 | 0.0717 | 0.0746 |
| 0.2597 | 33.87 | 31600 | 0.0718 | 0.0760 |
| 0.2511 | 34.3 | 32000 | 0.0737 | 0.0750 |
| 0.2551 | 34.73 | 32400 | 0.0732 | 0.0758 |
| 0.26 | 35.16 | 32800 | 0.0724 | 0.0746 |
| 0.2563 | 35.58 | 33200 | 0.0717 | 0.0730 |
| 0.2559 | 36.01 | 33600 | 0.0707 | 0.0734 |
| 0.2499 | 36.44 | 34000 | 0.0721 | 0.0729 |
| 0.252 | 36.87 | 34400 | 0.0716 | 0.0723 |
| 0.2448 | 37.3 | 34800 | 0.0711 | 0.0725 |
| 0.248 | 37.73 | 35200 | 0.0710 | 0.0727 |
| 0.2568 | 38.16 | 35600 | 0.0710 | 0.0720 |
| 0.2471 | 38.59 | 36000 | 0.0707 | 0.0725 |
| 0.2464 | 39.01 | 36400 | 0.0705 | 0.0719 |
| 0.2477 | 39.44 | 36800 | 0.0706 | 0.0727 |
| 0.2482 | 39.87 | 37200 | 0.0707 | 0.0724 |

Framework versions

  • Transformers 4.28.1
  • Pytorch 2.0.0+cu117
  • Datasets 2.11.0
  • Tokenizers 0.13.3