---
language:
  - zh
license: apache-2.0
base_model: openai/whisper-tiny
tags:
  - generated_from_trainer
datasets:
  - formospeech/hat_asr_aligned
model-index:
  - name: Whisper Tiny Hakka Simulated Webcam
    results: []
---

# Whisper Tiny Hakka Simulated Webcam

This model is a fine-tuned version of [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny) on the HAT ASR Aligned dataset. It achieves the following results on the evaluation set:

- Loss: 0.1323
- Cer: 7.3156
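
As a quick sanity check, the checkpoint can be run through the transformers ASR pipeline. A minimal sketch, assuming the model is published under a repository id like the placeholder below and that a 16 kHz audio file is at hand:

```python
# Minimal inference sketch. The repo id below is a placeholder for wherever
# this checkpoint is hosted; adjust it to the actual model id.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="jethrowang/whisper-tiny-hakka-simulated-webcam",  # placeholder repo id
)

# Whisper expects 16 kHz audio; the pipeline resamples file inputs as needed.
result = asr("sample.wav")
print(result["text"])
```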

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
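
The card does not describe the data beyond naming it, but the dataset id from the metadata can be inspected directly. A minimal sketch, assuming formospeech/hat_asr_aligned is accessible to you (it may be gated or private) and that a standard "train" split exists:

```python
# Sketch for inspecting the training data. Assumes you have access to the
# dataset; the split name "train" is an assumption.
from datasets import load_dataset

ds = load_dataset("formospeech/hat_asr_aligned")
print(ds)              # available splits and columns
print(ds["train"][0])  # first example, if a "train" split exists
```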

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

- learning_rate: 0.0001
- train_batch_size: 64
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 976
- training_steps: 9760
- mixed_precision_training: Native AMP
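
For reproduction, these values map naturally onto transformers' Seq2SeqTrainingArguments. A minimal sketch, where output_dir is a placeholder and the Adam betas/epsilon above are already the trainer defaults:

```python
# Sketch of how the listed hyperparameters translate to training arguments.
# output_dir is a placeholder; Adam betas/epsilon are the defaults listed above.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-tiny-hakka-simulated-webcam",  # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=32,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=976,
    max_steps=9760,
    fp16=True,  # "Native AMP" mixed precision
)
```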

### Training results

| Training Loss | Epoch   | Step | Validation Loss | Cer     |
|:-------------:|:-------:|:----:|:---------------:|:-------:|
| 0.2495        | 0.9980  | 488  | 0.3384          | 23.3014 |
| 0.1352        | 1.9959  | 976  | 0.2472          | 16.6320 |
| 0.0854        | 2.9939  | 1464 | 0.2133          | 16.6389 |
| 0.0447        | 3.9918  | 1952 | 0.1956          | 22.4831 |
| 0.0282        | 4.9898  | 2440 | 0.1921          | 12.4096 |
| 0.0167        | 5.9877  | 2928 | 0.1670          | 10.9682 |
| 0.0115        | 6.9857  | 3416 | 0.1833          | 10.5590 |
| 0.0078        | 7.9836  | 3904 | 0.1591          | 9.6204  |
| 0.0057        | 8.9816  | 4392 | 0.1568          | 10.5324 |
| 0.0037        | 9.9796  | 4880 | 0.1684          | 10.1371 |
| 0.0036        | 10.9775 | 5368 | 0.1626          | 10.8352 |
| 0.0015        | 11.9755 | 5856 | 0.1451          | 10.4226 |
| 0.0016        | 12.9734 | 6344 | 0.1562          | 10.2099 |
| 0.0007        | 13.9714 | 6832 | 0.1575          | 8.7731  |
| 0.0004        | 14.9693 | 7320 | 0.1395          | 9.8597  |
| 0.0006        | 15.9673 | 7808 | 0.1421          | 8.3316  |
| 0.0003        | 16.9652 | 8296 | 0.1345          | 7.5433  |
| 0.0001        | 17.9632 | 8784 | 0.1322          | 7.8692  |
| 0.0001        | 18.9611 | 9272 | 0.1326          | 7.3630  |
| 0.0001        | 19.9591 | 9760 | 0.1323          | 7.3156  |
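
The Cer column is a character error rate expressed in percent. Whether this exact tooling was used is an assumption, but such a figure can be computed with the evaluate library; the strings below are made-up examples:

```python
# Sketch: computing a character error rate like the Cer column above.
# evaluate's cer metric returns a fraction; multiply by 100 for percent.
import evaluate

cer_metric = evaluate.load("cer")
predictions = ["今天天氣很好"]  # made-up hypothesis
references = ["今天天氣真好"]  # made-up reference
cer = cer_metric.compute(predictions=predictions, references=references)
print(f"CER: {100 * cer:.4f}%")
```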

### Framework versions

- Transformers 4.42.3
- Pytorch 2.3.0+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1