metadata
language:
  - zh
license: apache-2.0
base_model: openai/whisper-tiny
tags:
  - generated_from_trainer
datasets:
  - formospeech/hat_asr_aligned
model-index:
  - name: Whisper Tiny Hakka Simulated Webcam
    results: []

Whisper Tiny Hakka Simulated Webcam

This model is a fine-tuned version of openai/whisper-tiny on the HAT ASR Aligned dataset (formospeech/hat_asr_aligned). At the final training step (9760), it achieves the following results on the evaluation set:

  • Loss: 0.1845
  • CER (character error rate): 9.7476
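
The card does not say how the character error rate was computed. The sketch below assumes the usual definition: total character-level edit distance divided by the total number of reference characters, reported on a percent scale. The function names and the example strings are hypothetical and are not taken from the training code or the dataset.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference character
                curr[j - 1] + 1,     # insert a hypothesis character
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def cer_percent(references, hypotheses) -> float:
    """Corpus-level CER in percent (assumed definition, see the text above)."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_chars = sum(len(r) for r in references)
    return 100.0 * total_edits / total_chars


# Hypothetical example: one of six reference characters is missing.
print(cer_percent(["今天天氣很好"], ["今天天氣好"]))
```

Under this definition, a score of 9.7476 would mean roughly one character error per ten reference characters.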

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 64
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 976
  • training_steps: 9760
  • mixed_precision_training: Native AMP
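
To make the learning-rate settings concrete, here is a minimal sketch of a linear schedule with warmup using the values listed above. It assumes the common shape for this scheduler type: a linear ramp from 0 to the peak rate over the warmup steps, then a linear decay to 0 at the final step. The function name is hypothetical and the code is not taken from the training script.

```python
def linear_schedule_lr(step: int,
                       base_lr: float = 1e-4,
                       warmup_steps: int = 976,
                       total_steps: int = 9760) -> float:
    """Learning rate at a given optimizer step (assumed warmup + linear decay)."""
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to base_lr.
        return base_lr * step / warmup_steps
    # Decay: fall linearly from base_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


for step in (0, 488, 976, 5368, 9760):
    print(step, linear_schedule_lr(step))
```

With these values the warmup covers the first 10% of training (976 of 9760 steps), so the peak rate of 0.0001 is reached at step 976.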

Training results

| Training Loss | Epoch   | Step | Validation Loss | CER     |
|---------------|---------|------|-----------------|---------|
| 0.2444        | 0.9980  | 488  | 0.3550          | 25.2919 |
| 0.1293        | 1.9959  | 976  | 0.3005          | 32.6282 |
| 0.0835        | 2.9939  | 1464 | 0.2549          | 15.8437 |
| 0.0429        | 3.9918  | 1952 | 0.2720          | 23.6551 |
| 0.0279        | 4.9898  | 2440 | 0.2567          | 17.1440 |
| 0.0177        | 5.9877  | 2928 | 0.2180          | 13.6903 |
| 0.0103        | 6.9857  | 3416 | 0.2410          | 14.9398 |
| 0.007         | 7.9836  | 3904 | 0.2333          | 12.8846 |
| 0.0052        | 8.9816  | 4392 | 0.2008          | 12.4708 |
| 0.004         | 9.9796  | 4880 | 0.2393          | 12.1957 |
| 0.0023        | 10.9775 | 5368 | 0.2057          | 12.9829 |
| 0.0021        | 11.9755 | 5856 | 0.2098          | 12.4096 |
| 0.0014        | 12.9734 | 6344 | 0.1898          | 14.2208 |
| 0.001         | 13.9714 | 6832 | 0.2006          | 10.6954 |
| 0.0012        | 14.9693 | 7320 | 0.1941          | 13.2117 |
| 0.0005        | 15.9673 | 7808 | 0.1864          | 11.2548 |
| 0.0005        | 16.9652 | 8296 | 0.1887          | 10.7012 |
| 0.0001        | 17.9632 | 8784 | 0.1830          | 10.1197 |
| 0.0001        | 18.9611 | 9272 | 0.1863          | 9.8146  |
| 0.0001        | 19.9591 | 9760 | 0.1845          | 9.7476  |
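
Validation loss and CER do not fall in lockstep across these checkpoints, so the best checkpoint depends on which metric is used for selection. The sketch below copies the step, validation loss, and CER columns from the results above and picks the best row under each criterion. The variable names are illustrative only.

```python
# (step, validation_loss, cer) copied from the training results above.
results = [
    (488, 0.3550, 25.2919), (976, 0.3005, 32.6282), (1464, 0.2549, 15.8437),
    (1952, 0.2720, 23.6551), (2440, 0.2567, 17.1440), (2928, 0.2180, 13.6903),
    (3416, 0.2410, 14.9398), (3904, 0.2333, 12.8846), (4392, 0.2008, 12.4708),
    (4880, 0.2393, 12.1957), (5368, 0.2057, 12.9829), (5856, 0.2098, 12.4096),
    (6344, 0.1898, 14.2208), (6832, 0.2006, 10.6954), (7320, 0.1941, 13.2117),
    (7808, 0.1864, 11.2548), (8296, 0.1887, 10.7012), (8784, 0.1830, 10.1197),
    (9272, 0.1863, 9.8146), (9760, 0.1845, 9.7476),
]

best_by_loss = min(results, key=lambda row: row[1])  # lowest validation loss
best_by_cer = min(results, key=lambda row: row[2])   # lowest CER

print("lowest validation loss:", best_by_loss)
print("lowest CER:", best_by_cer)
```

Here the lowest validation loss (0.1830) is at step 8784, while the lowest CER (9.7476) is at the final step, 9760, which is the checkpoint reported at the top of this card.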

Framework versions

  • Transformers 4.42.3
  • Pytorch 2.3.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1