metadata
license: apache-2.0
base_model: facebook/wav2vec2-xls-r-300m
tags:
  - generated_from_trainer
datasets:
  - common_voice_11_0
metrics:
  - wer
model-index:
  - name: wav2vec2-large-xls-r-300m-cv_vi
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: common_voice_11_0
          type: common_voice_11_0
          config: vi
          split: test
          args: vi
        metrics:
          - name: Wer
            type: wer
            value: 0.663156740155753

wav2vec2-large-xls-r-300m-cv_vi

This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m on the Vietnamese (vi) configuration of the common_voice_11_0 dataset. At the last logged evaluation (step 10400), it achieves the following results on the evaluation set:

  • Loss: 2.3858
  • Wer: 0.6632
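For reference, WER (word error rate) is the word-level edit distance between the model's transcript and the reference transcript, divided by the number of reference words, so a value of 0.6632 corresponds to roughly 66 word errors per 100 reference words. The following is a minimal sketch in plain Python, assuming whitespace tokenization; the function name `word_error_rate` and the example strings are illustrative and are not taken from this model's training or evaluation code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Illustrative strings only: one substitution plus one deletion
# against a four-word reference.
example_wer = word_error_rate("a b c d", "a x c")
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.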

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 128
  • total_eval_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 1000
  • num_epochs: 500
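The two "total" batch sizes follow from the per-device values above, and the linear scheduler with warmup can be written as a small function. The sketch below reproduces that arithmetic in plain Python. The function name `linear_lr` is hypothetical, and `ASSUMED_TOTAL_STEPS` is an assumption (about 21.75 optimizer steps per epoch, estimated from the results table, times 500 epochs); the card does not state the total step count.

```python
# Values copied from the hyperparameter list above.
train_batch_size = 8            # per device
eval_batch_size = 8             # per device
num_devices = 4
gradient_accumulation_steps = 4
learning_rate = 3e-4
warmup_steps = 1000

# Effective batch sizes: gradient accumulation applies to training only.
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
total_eval_batch_size = eval_batch_size * num_devices

# ASSUMPTION: not stated in the card; estimated as ~21.75 steps/epoch * 500 epochs.
ASSUMED_TOTAL_STEPS = 10_875


def linear_lr(step: int, total_steps: int,
              base_lr: float = learning_rate,
              warmup: int = warmup_steps) -> float:
    """Linear warmup from 0 to base_lr, then linear decay to 0 at total_steps."""
    if step < warmup:
        return base_lr * step / warmup
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup)


lr_mid_warmup = linear_lr(500, ASSUMED_TOTAL_STEPS)    # halfway through warmup
lr_peak = linear_lr(1000, ASSUMED_TOTAL_STEPS)         # end of warmup
lr_end = linear_lr(ASSUMED_TOTAL_STEPS, ASSUMED_TOTAL_STEPS)
```

With 1000 warmup steps, the learning rate peaks at about epoch 46 according to the step-to-epoch mapping in the results table.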

Training results

| Training Loss | Epoch  | Step  | Validation Loss | Wer    |
|---------------|--------|-------|-----------------|--------|
| 14.1667       | 9.2    | 200   | 4.5633          | 1.0    |
| 3.6334        | 18.39  | 400   | 3.4332          | 1.0    |
| 1.938         | 27.59  | 600   | 1.2434          | 0.7082 |
| 0.3082        | 36.78  | 800   | 1.2288          | 0.6534 |
| 0.1766        | 45.98  | 1000  | 1.2915          | 0.6500 |
| 0.1287        | 55.17  | 1200  | 1.3452          | 0.6269 |
| 0.1043        | 64.37  | 1400  | 1.4746          | 0.6395 |
| 0.0834        | 73.56  | 1600  | 1.4731          | 0.6347 |
| 0.0837        | 82.76  | 1800  | 1.5893          | 0.6493 |
| 0.0711        | 91.95  | 2000  | 1.6205          | 0.6522 |
| 0.0672        | 101.15 | 2200  | 1.5513          | 0.6503 |
| 0.0745        | 110.34 | 2400  | 1.6509          | 0.6774 |
| 0.07          | 119.54 | 2600  | 1.6779          | 0.6543 |
| 0.0492        | 128.74 | 2800  | 1.7616          | 0.6611 |
| 0.0473        | 137.93 | 3000  | 1.7885          | 0.6634 |
| 0.0535        | 147.13 | 3200  | 1.8877          | 0.6806 |
| 0.0468        | 156.32 | 3400  | 1.7766          | 0.6671 |
| 0.0386        | 165.52 | 3600  | 1.7956          | 0.6494 |
| 0.0418        | 174.71 | 3800  | 1.9402          | 0.6851 |
| 0.0426        | 183.91 | 4000  | 1.9777          | 0.6927 |
| 0.0395        | 193.1  | 4200  | 1.8733          | 0.6689 |
| 0.0392        | 202.3  | 4400  | 1.8994          | 0.6774 |
| 0.0377        | 211.49 | 4600  | 1.9983          | 0.6889 |
| 0.0354        | 220.69 | 4800  | 1.8858          | 0.6645 |
| 0.0315        | 229.89 | 5000  | 1.9716          | 0.6805 |
| 0.0312        | 239.08 | 5200  | 2.0422          | 0.6825 |
| 0.0292        | 248.28 | 5400  | 2.0780          | 0.7019 |
| 0.0283        | 257.47 | 5600  | 1.9102          | 0.6743 |
| 0.025         | 266.67 | 5800  | 1.9745          | 0.6756 |
| 0.0246        | 275.86 | 6000  | 2.1289          | 0.6918 |
| 0.0234        | 285.06 | 6200  | 2.1775          | 0.7068 |
| 0.0219        | 294.25 | 6400  | 2.1755          | 0.6935 |
| 0.0182        | 303.45 | 6600  | 2.1602          | 0.6764 |
| 0.0174        | 312.64 | 6800  | 2.1359          | 0.6596 |
| 0.0157        | 321.84 | 7000  | 2.1958          | 0.6797 |
| 0.0147        | 331.03 | 7200  | 2.1460          | 0.6657 |
| 0.0135        | 340.23 | 7400  | 2.2716          | 0.6719 |
| 0.0124        | 349.43 | 7600  | 2.3556          | 0.6762 |
| 0.0109        | 358.62 | 7800  | 2.2520          | 0.6632 |
| 0.0115        | 367.82 | 8000  | 2.3112          | 0.6802 |
| 0.0108        | 377.01 | 8200  | 2.2925          | 0.6659 |
| 0.0106        | 386.21 | 8400  | 2.2950          | 0.6726 |
| 0.0088        | 395.4  | 8600  | 2.3078          | 0.6735 |
| 0.0084        | 404.6  | 8800  | 2.3538          | 0.6723 |
| 0.0079        | 413.79 | 9000  | 2.3212          | 0.6615 |
| 0.0074        | 422.99 | 9200  | 2.3908          | 0.6774 |
| 0.0094        | 432.18 | 9400  | 2.3164          | 0.6779 |
| 0.0077        | 441.38 | 9600  | 2.3105          | 0.6649 |
| 0.0066        | 450.57 | 9800  | 2.3599          | 0.6742 |
| 0.007         | 459.77 | 10000 | 2.3675          | 0.6709 |
| 0.0056        | 468.97 | 10200 | 2.3964          | 0.6677 |
| 0.0049        | 478.16 | 10400 | 2.3858          | 0.6632 |
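The log above shows training loss falling steadily while validation loss rises after step 800 and WER never improves on its step-1200 value, which suggests the later checkpoints are overfit. A small sketch of how the best rows could be picked out programmatically is given below. It uses only an excerpt of the table (the first eight rows and the final row); the tuple layout and variable names are illustrative. Both minima found in the excerpt are also the minima over the full table.

```python
# Excerpt of the results table above:
# (training_loss, epoch, step, validation_loss, wer)
rows = [
    (14.1667, 9.2, 200, 4.5633, 1.0),
    (3.6334, 18.39, 400, 3.4332, 1.0),
    (1.938, 27.59, 600, 1.2434, 0.7082),
    (0.3082, 36.78, 800, 1.2288, 0.6534),
    (0.1766, 45.98, 1000, 1.2915, 0.6500),
    (0.1287, 55.17, 1200, 1.3452, 0.6269),
    (0.1043, 64.37, 1400, 1.4746, 0.6395),
    (0.0834, 73.56, 1600, 1.4731, 0.6347),
    (0.0049, 478.16, 10400, 2.3858, 0.6632),  # final logged evaluation
]

# Row with the lowest word error rate (column index 4).
best_by_wer = min(rows, key=lambda row: row[4])

# Row with the lowest validation loss (column index 3).
best_by_val_loss = min(rows, key=lambda row: row[3])

final_row = rows[-1]

# Absolute WER gap between the final row and the best row.
wer_gap = final_row[4] - best_by_wer[4]

print(f"lowest WER:      step {best_by_wer[2]}, WER {best_by_wer[4]}")
print(f"lowest val loss: step {best_by_val_loss[2]}, loss {best_by_val_loss[3]}")
print(f"final vs best WER gap: {wer_gap:.4f}")
```

Whether the published weights correspond to the final step or to an earlier checkpoint is not stated in this card; the reported metrics (loss 2.3858, WER 0.6632) match the final row.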

Framework versions

  • Transformers 4.33.2
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.5
  • Tokenizers 0.13.3