metadata
library_name: transformers
license: cc-by-nc-4.0
base_model: facebook/mms-1b-all
tags:
  - generated_from_trainer
metrics:
  - wer
model-index:
  - name: mms-1b-bemgen-male-model-test
    results: []

mms-1b-bemgen-male-model-test

This model is a fine-tuned version of facebook/mms-1b-all on an unknown dataset. It achieves the following results on the evaluation set, which match the last logged evaluation (step 2600, epoch 2.6887) in the training results:

  • Loss: 0.3060
  • Wer: 0.4447
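For readers unfamiliar with the metric, word error rate (WER) is the word-level edit distance between a reference transcript and the model's hypothesis, divided by the number of reference words. The sketch below is a minimal, dependency-free illustration of that definition. It is not the evaluation code used for this model, which the card does not show, and the example sentences are invented for illustration. A corpus-level WER like the one reported above is usually computed by summing errors and reference words over all utterances, not by averaging per-utterance scores.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one substitution ("sat" -> "sit") and one
# deletion ("the") against a six-word reference.
example = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(round(example, 4))  # 0.3333
```

Under this definition, the reported WER of 0.4447 corresponds to roughly 44 word errors per 100 reference words.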

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 30.0
  • mixed_precision_training: Native AMP
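To make the scheduler settings concrete, the sketch below reproduces the usual "linear with warmup" rule in plain Python: the learning rate rises linearly from 0 to the base rate over the warmup steps, then decays linearly to 0 at the end of training. The total number of optimizer steps is not stated in this card, so it is estimated here from the training log (step 100 is reported at epoch 0.1034) and the configured 30 epochs. Treat `STEPS_PER_EPOCH` and `TOTAL_STEPS` as assumptions, not as values taken from the training run.

```python
BASE_LR = 3e-4      # learning_rate from the list above
WARMUP_STEPS = 100  # lr_scheduler_warmup_steps from the list above
NUM_EPOCHS = 30     # num_epochs from the list above

# ASSUMPTION: steps per epoch is inferred from the training log, where
# step 100 corresponds to epoch 0.1034. It is not stated in the card.
STEPS_PER_EPOCH = round(100 / 0.1034)       # 967
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 29010


def linear_lr(step: int,
              base_lr: float = BASE_LR,
              warmup: int = WARMUP_STEPS,
              total: int = TOTAL_STEPS) -> float:
    """Learning rate at a given optimizer step for linear warmup and decay."""
    if step < warmup:
        # Warmup phase: scale up linearly from 0 to base_lr.
        return base_lr * step / max(1, warmup)
    # Decay phase: scale down linearly from base_lr to 0 at `total`.
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


for step in (0, 50, 100, 2600, TOTAL_STEPS):
    print(f"step {step:>6}: lr = {linear_lr(step):.6e}")
```

If the estimate holds, the learning rate at the last logged step (2600) is still close to its peak, since the decay is spread over the full 30 configured epochs.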

Training results

| Training Loss | Epoch  | Step | Validation Loss | WER    |
|---------------|--------|------|-----------------|--------|
| 6.9809        | 0.1034 | 100  | 1.3139          | 0.9957 |
| 0.745         | 0.2068 | 200  | 0.4297          | 0.5882 |
| 0.5423        | 0.3102 | 300  | 0.3886          | 0.5644 |
| 0.539         | 0.4137 | 400  | 0.3683          | 0.5448 |
| 0.5277        | 0.5171 | 500  | 0.3529          | 0.5083 |
| 0.4708        | 0.6205 | 600  | 0.3493          | 0.4997 |
| 0.4889        | 0.7239 | 700  | 0.3467          | 0.5096 |
| 0.4793        | 0.8273 | 800  | 0.3407          | 0.4818 |
| 0.469         | 0.9307 | 900  | 0.3455          | 0.4958 |
| 0.4407        | 1.0341 | 1000 | 0.3329          | 0.4735 |
| 0.4524        | 1.1375 | 1100 | 0.3289          | 0.4879 |
| 0.4416        | 1.2410 | 1200 | 0.3280          | 0.4911 |
| 0.4599        | 1.3444 | 1300 | 0.3285          | 0.4765 |
| 0.4739        | 1.4478 | 1400 | 0.3221          | 0.4694 |
| 0.4466        | 1.5512 | 1500 | 0.3196          | 0.4588 |
| 0.4483        | 1.6546 | 1600 | 0.3144          | 0.4526 |
| 0.4543        | 1.7580 | 1700 | 0.3170          | 0.4528 |
| 0.4537        | 1.8614 | 1800 | 0.3141          | 0.4522 |
| 0.4293        | 1.9648 | 1900 | 0.3106          | 0.4453 |
| 0.4457        | 2.0683 | 2000 | 0.3134          | 0.4651 |
| 0.4214        | 2.1717 | 2100 | 0.3119          | 0.4543 |
| 0.4103        | 2.2751 | 2200 | 0.3089          | 0.4391 |
| 0.407         | 2.3785 | 2300 | 0.3053          | 0.4331 |
| 0.4314        | 2.4819 | 2400 | 0.3059          | 0.4337 |
| 0.4144        | 2.5853 | 2500 | 0.3054          | 0.4382 |
| 0.4099        | 2.6887 | 2600 | 0.3060          | 0.4447 |
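The logged evaluations can be scanned for the best checkpoint, which need not be the last one. The sketch below copies the Step, Validation Loss, and WER columns from the results above into a list and picks the minimum of each metric. It only analyzes the numbers reported in this card. Whether a checkpoint was actually saved at each of these steps is not stated, so that remains an assumption.

```python
# (step, validation_loss, wer) copied from the training results above.
EVAL_LOG = [
    (100, 1.3139, 0.9957), (200, 0.4297, 0.5882), (300, 0.3886, 0.5644),
    (400, 0.3683, 0.5448), (500, 0.3529, 0.5083), (600, 0.3493, 0.4997),
    (700, 0.3467, 0.5096), (800, 0.3407, 0.4818), (900, 0.3455, 0.4958),
    (1000, 0.3329, 0.4735), (1100, 0.3289, 0.4879), (1200, 0.3280, 0.4911),
    (1300, 0.3285, 0.4765), (1400, 0.3221, 0.4694), (1500, 0.3196, 0.4588),
    (1600, 0.3144, 0.4526), (1700, 0.3170, 0.4528), (1800, 0.3141, 0.4522),
    (1900, 0.3106, 0.4453), (2000, 0.3134, 0.4651), (2100, 0.3119, 0.4543),
    (2200, 0.3089, 0.4391), (2300, 0.3053, 0.4331), (2400, 0.3059, 0.4337),
    (2500, 0.3054, 0.4382), (2600, 0.3060, 0.4447),
]

# Pick the evaluation with the lowest WER and the one with the lowest loss.
best_by_wer = min(EVAL_LOG, key=lambda row: row[2])
best_by_loss = min(EVAL_LOG, key=lambda row: row[1])
final = EVAL_LOG[-1]

print(f"best WER:   step {best_by_wer[0]}, WER {best_by_wer[2]:.4f}")
print(f"best loss:  step {best_by_loss[0]}, loss {best_by_loss[1]:.4f}")
print(f"final eval: step {final[0]}, loss {final[1]:.4f}, WER {final[2]:.4f}")
```

On these numbers, both metrics are lowest at step 2300 (validation loss 0.3053, WER 0.4331), slightly better than the final evaluation at step 2600 (0.3060, 0.4447).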

Framework versions

  • Transformers 4.47.1
  • Pytorch 2.5.1+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0
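
When trying to reproduce results, it can help to compare the local environment against the versions listed above. The sketch below uses only the standard library and reports, for each package, the expected version, the installed version (or `None` if the package is missing), and whether they match. The distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) are the usual PyPI names and are an assumption here. The `+cu124` suffix on the PyTorch version is a CUDA build tag, so a mismatch on that suffix alone may be harmless.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this card, keyed by (assumed) PyPI distribution name.
EXPECTED = {
    "transformers": "4.47.1",
    "torch": "2.5.1+cu124",
    "datasets": "3.2.0",
    "tokenizers": "0.21.0",
}


def check_versions(expected=EXPECTED):
    """Return {name: (expected, installed_or_None, matches)} per package."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (wanted, installed, installed == wanted)
    return report


for name, (wanted, installed, ok) in check_versions().items():
    status = "ok" if ok else "differs"
    print(f"{name}: expected {wanted}, installed {installed} ({status})")
```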