metadata

language:
  - dv
license: apache-2.0
tags:
  - automatic-speech-recognition
  - mozilla-foundation/common_voice_8_0
  - generated_from_trainer
  - robust-speech-event
datasets:
  - common_voice
model-index:
  - name: XLS-R-300M - Dhivehi - CV8
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: Common Voice 8
          type: mozilla-foundation/common_voice_8_0
          args: dv
        metrics:
          - name: Test WER
            type: wer
            value: 29.69
          - name: Test CER
            type: cer
            value: 5.48

xls-r-300m-dv

This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m on the Dhivehi (dv) subset of the mozilla-foundation/common_voice_8_0 dataset. It achieves the following results on the evaluation set:

Loss: 0.3149
WER: 0.2947
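
For readers unfamiliar with the metric, word error rate (WER) is the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The value above is a fraction (0.2947 is about 29.47%), while the Test WER in the metadata appears to be given in percent. The function below is a minimal, dependency-free sketch for a single utterance; `word_error_rate` is an illustrative name, not the evaluation script used for this model, and corpus-level scores are usually computed by summing errors over all utterances before dividing.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against four reference words.
    print(word_error_rate("a b c d", "a x c"))
```

The character error rate (CER) follows the same recipe with characters in place of words.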

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0003
train_batch_size: 16
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 32
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 500
num_epochs: 50
mixed_precision_training: Native AMP
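
As a rough illustration of how these settings interact, the sketch below derives the total train batch size from the per-step batch size and gradient accumulation, and evaluates a linear learning-rate schedule with warmup. It is a plain-Python approximation, not the training script: the function names are illustrative, a single device is assumed (consistent with 16 x 2 = 32), and `TOTAL_STEPS` is assumed to equal the last logged step in the training results, so the true end of training may differ slightly.

```python
TRAIN_BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 2
PEAK_LR = 3e-4
WARMUP_STEPS = 500
TOTAL_STEPS = 30_000  # assumption: last logged step, not a confirmed total


def effective_batch_size(per_step: int, accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Number of samples contributing to each optimizer update."""
    return per_step * accumulation_steps * num_devices


def linear_lr(step: int, peak_lr: float = PEAK_LR,
              warmup_steps: int = WARMUP_STEPS,
              total_steps: int = TOTAL_STEPS) -> float:
    """Linear warmup from 0 to peak_lr, then linear decay to 0."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print(effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    for step in (0, 250, 500, 15_250, 30_000):
        print(step, linear_lr(step))
```

Under these assumptions the learning rate reaches 0.0003 at step 500 and falls to half that value midway between the end of warmup and the final step.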

Training results

Training Loss	Epoch	Step	Validation Loss	WER
3.9617	0.66	400	1.4251	0.9768
0.9081	1.33	800	0.6068	0.7290
0.6575	1.99	1200	0.4700	0.6234
0.548	2.65	1600	0.4158	0.5868
0.5031	3.32	2000	0.4067	0.5728
0.4792	3.98	2400	0.3965	0.5673
0.4344	4.64	2800	0.3862	0.5383
0.4237	5.31	3200	0.3794	0.5316
0.3984	5.97	3600	0.3395	0.5177
0.3788	6.63	4000	0.3528	0.5329
0.3685	7.3	4400	0.3404	0.5060
0.3535	7.96	4800	0.3425	0.5069
0.3391	8.62	5200	0.3576	0.5118
0.331	9.29	5600	0.3259	0.4783
0.3192	9.95	6000	0.3145	0.4794
0.2956	10.61	6400	0.3111	0.4650
0.2936	11.28	6800	0.3303	0.4741
0.2868	11.94	7200	0.3109	0.4597
0.2743	12.6	7600	0.3191	0.4557
0.2654	13.27	8000	0.3286	0.4570
0.2556	13.93	8400	0.3186	0.4468
0.2452	14.59	8800	0.3405	0.4582
0.241	15.26	9200	0.3418	0.4533
0.2313	15.92	9600	0.3388	0.4405
0.2234	16.58	10000	0.3659	0.4421
0.2194	17.25	10400	0.3559	0.4490
0.2168	17.91	10800	0.3452	0.4355
0.2036	18.57	11200	0.3496	0.4259
0.2046	19.24	11600	0.3282	0.4245
0.1917	19.9	12000	0.3201	0.4052
0.1908	20.56	12400	0.3439	0.4165
0.1838	21.23	12800	0.3165	0.3950
0.1828	21.89	13200	0.3332	0.4079
0.1774	22.55	13600	0.3485	0.4072
0.1776	23.22	14000	0.3308	0.3868
0.1693	23.88	14400	0.3153	0.3906
0.1656	24.54	14800	0.3408	0.3899
0.1629	25.21	15200	0.3333	0.3854
0.164	25.87	15600	0.3172	0.3775
0.1505	26.53	16000	0.3105	0.3777
0.1524	27.2	16400	0.3136	0.3726
0.1482	27.86	16800	0.3110	0.3710
0.1423	28.52	17200	0.3299	0.3687
0.1419	29.19	17600	0.3271	0.3645
0.135	29.85	18000	0.3333	0.3638
0.1319	30.51	18400	0.3272	0.3640
0.131	31.18	18800	0.3438	0.3636
0.1252	31.84	19200	0.3266	0.3557
0.1238	32.5	19600	0.3195	0.3516
0.1203	33.17	20000	0.3405	0.3534
0.1159	33.83	20400	0.3287	0.3509
0.115	34.49	20800	0.3474	0.3433
0.108	35.16	21200	0.3245	0.3381
0.1091	35.82	21600	0.3185	0.3448
0.1043	36.48	22000	0.3309	0.3363
0.1034	37.15	22400	0.3288	0.3349
0.1015	37.81	22800	0.3222	0.3284
0.0953	38.47	23200	0.3272	0.3315
0.0966	39.14	23600	0.3196	0.3239
0.0938	39.8	24000	0.3199	0.3280
0.0905	40.46	24400	0.3193	0.3166
0.0893	41.13	24800	0.3224	0.3222
0.0858	41.79	25200	0.3216	0.3142
0.0839	42.45	25600	0.3241	0.3135
0.0819	43.12	26000	0.3260	0.3071
0.0782	43.78	26400	0.3202	0.3075
0.0775	44.44	26800	0.3140	0.3067
0.0751	45.11	27200	0.3118	0.3020
0.0736	45.77	27600	0.3155	0.2976
0.071	46.43	28000	0.3105	0.2998
0.0715	47.1	28400	0.3065	0.2993
0.0668	47.76	28800	0.3161	0.2972
0.0698	48.42	29200	0.3137	0.2967
0.0681	49.09	29600	0.3130	0.2971
0.0651	49.75	30000	0.3149	0.2947
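
Validation loss and WER do not always agree on which checkpoint is best. The sketch below takes the last six rows of the table above and selects the best row under each criterion; `best_row` and the column constants are illustrative helpers, and only these six rows are considered.

```python
# Last six logged rows: (training loss, epoch, step, validation loss, WER)
ROWS = [
    (0.0710, 46.43, 28000, 0.3105, 0.2998),
    (0.0715, 47.10, 28400, 0.3065, 0.2993),
    (0.0668, 47.76, 28800, 0.3161, 0.2972),
    (0.0698, 48.42, 29200, 0.3137, 0.2967),
    (0.0681, 49.09, 29600, 0.3130, 0.2971),
    (0.0651, 49.75, 30000, 0.3149, 0.2947),
]

STEP, VAL_LOSS, WER = 2, 3, 4  # column indices into each row


def best_row(rows, column):
    """Return the row with the smallest value in the given column."""
    return min(rows, key=lambda row: row[column])


if __name__ == "__main__":
    by_loss = best_row(ROWS, VAL_LOSS)
    by_wer = best_row(ROWS, WER)
    print("lowest validation loss at step", by_loss[STEP])
    print("lowest WER at step", by_wer[STEP])
```

Among these rows the lowest validation loss is at step 28400, while the lowest WER is at the final logged step, 30000, which matches the evaluation results reported at the top of this card.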

Framework versions

Transformers 4.16.0.dev0
PyTorch 1.10.1+cu102
Datasets 1.17.1.dev0
Tokenizers 0.11.0