mms-1b-all-sw-CV_Fleurs_AMMI_ALFFA-100hrs-v1

This model is a fine-tuned version of facebook/mms-1b-all on a 100-hour Swahili speech corpus drawn from Common Voice, FLEURS, AMMI, and ALFFA (as indicated by the model name; the dataset field of the auto-generated card was left unspecified). It achieves the following results on the evaluation set:

  • WER (word error rate): 0.1917
  • CER (character error rate): 0.0678
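
The snippet below is a minimal inference sketch, assuming the checkpoint loads with the standard Wav2Vec2 CTC classes used by facebook/mms-1b-all; the audio path is a placeholder, and input audio is expected to be 16 kHz mono.

```python
# Minimal inference sketch (assumption: the checkpoint exposes the standard
# Wav2Vec2 CTC interface of facebook/mms-1b-all; adjust if the repo differs).
import torch
import librosa
from transformers import AutoProcessor, AutoModelForCTC

model_id = "asr-africa/mms-1b-all-sw-CV_Fleurs_AMMI_ALFFA-100hrs-v1"
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForCTC.from_pretrained(model_id)

# Load any 16 kHz mono Swahili clip; "sample.wav" is a placeholder path.
speech, sr = librosa.load("sample.wav", sr=16_000)

inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

pred_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(pred_ids)[0])
```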

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 8
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 80
  • mixed_precision_training: Native AMP
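
For reference, the sketch below shows how these hyperparameters might map onto transformers.TrainingArguments. The output directory is a placeholder, and the per-epoch evaluation strategy is inferred from the results table rather than stated in the card.

```python
# Hypothetical mapping of the listed hyperparameters onto TrainingArguments;
# paths and the evaluation strategy are assumptions, not taken from the card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mms-1b-all-sw-100hrs",   # placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=2,        # effective train batch size of 8
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=80,
    fp16=True,                            # "Native AMP" mixed precision
    eval_strategy="epoch",                # assumed from the per-epoch results below
)
```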

Training results

Training Loss   Epoch   Step   Validation WER   Validation CER
0.5825 1.0 8023 0.2372 0.0815
0.3237 2.0 16046 0.2313 0.0793
0.3009 3.0 24069 0.2295 0.0779
0.2851 4.0 32092 0.2255 0.0769
0.2762 5.0 40115 0.2241 0.0761
0.2678 6.0 48138 0.2204 0.0752
0.2625 7.0 56161 0.2191 0.0746
0.2577 8.0 64184 0.2155 0.0742
0.2536 9.0 72207 0.2128 0.0735
0.2499 10.0 80230 0.2118 0.0734
0.2461 11.0 88253 0.2113 0.0727
0.2417 12.0 96276 0.2088 0.0722
0.2381 13.0 104299 0.2067 0.0714
0.2355 14.0 112322 0.2055 0.0718
0.232 15.0 120345 0.2035 0.0709
0.2299 16.0 128368 0.2083 0.0739
0.2269 17.0 136391 0.2045 0.0718
0.2234 18.0 144414 0.2006 0.0701
0.2216 19.0 152437 0.2012 0.0699
0.2183 20.0 160460 0.1991 0.0695
0.2169 21.0 168483 0.1986 0.0697
0.2151 22.0 176506 0.1986 0.0705
0.2116 23.0 184529 0.1998 0.0703
0.2101 24.0 192552 0.2012 0.0714
0.2075 25.0 200575 0.1974 0.0693
0.2066 26.0 208598 0.1966 0.0697
0.2052 27.0 216621 0.1985 0.0709
0.2026 28.0 224644 0.1989 0.0702
0.2019 29.0 232667 0.1980 0.0706
0.1999 30.0 240690 0.1978 0.0709
0.1991 31.0 248713 0.1931 0.0678
0.1977 32.0 256736 0.1948 0.0691
0.1949 33.0 264759 0.1925 0.0683
0.1955 34.0 272782 0.1923 0.0678
0.1936 35.0 280805 0.1938 0.0685
0.1926 36.0 288828 0.1931 0.0684
0.1907 37.0 296851 0.1921 0.0680
0.1894 38.0 304874 0.1923 0.0677
0.1888 39.0 312897 0.1908 0.0681
0.187 40.0 320920 0.1931 0.0686
0.1857 41.0 328943 0.1909 0.0677
0.1849 42.0 336966 0.1921 0.0681
0.1837 43.0 344989 0.1907 0.0679
0.1837 44.0 353012 0.1923 0.0682
0.1824 45.0 361035 0.1908 0.0678
0.1817 46.0 369058 0.1905 0.0684
0.1799 47.0 377081 0.1910 0.0680
0.1794 48.0 385104 0.1919 0.0679
0.18 49.0 393127 0.1917 0.0678
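
The WER and CER metrics reported above are the kind typically computed with the Hugging Face evaluate library; the strings in the sketch below are placeholders, not samples from the evaluation set.

```python
# Illustrative WER/CER computation with the `evaluate` library;
# the example transcripts are placeholders.
import evaluate

wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

references = ["habari ya asubuhi"]    # ground-truth transcript (placeholder)
predictions = ["habari za asubuhi"]   # model output (placeholder)

print("WER:", wer_metric.compute(references=references, predictions=predictions))
print("CER:", cer_metric.compute(references=references, predictions=predictions))
```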

Framework versions

  • Transformers 4.48.2
  • PyTorch 2.5.1+cu121
  • Datasets 3.2.0
  • Tokenizers 0.21.0