Model Description

Pre-Training Settings:

166k samples from Common Voice 13.0 were transcribed with Whisper tiny.en.
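For context, producing the Whisper hypotheses might look like the following minimal sketch, assuming the openai-whisper package; the file path is illustrative, not from the original pipeline.

```python
# Minimal sketch: transcribe one Common Voice clip with Whisper tiny.en.
# Assumes the openai-whisper package; the file name below is hypothetical.
import whisper

model = whisper.load_model("tiny.en")

def transcribe(audio_path: str) -> str:
    """Return the Whisper hypothesis text for one audio clip."""
    result = model.transcribe(audio_path)
    return result["text"].strip()

hypothesis = transcribe("common_voice_clip.mp3")  # hypothetical path
```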

1,000 random samples were selected as the test set; the remaining samples were split 80%/20% into training and validation sets.
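One way the split could be implemented is sketched below; the random seed and the (hypothesis, gold) pair format are assumptions for illustration.

```python
# Sketch of the data split: 1,000 random test samples, then 80/20 train/val.
import random
from typing import List, Tuple

def split_data(pairs: List[Tuple[str, str]], seed: int = 0):
    """Split (hypothesis, gold) pairs into train/validation/test sets.

    The seed value is an assumption; the randomness of the original
    split is not documented here.
    """
    rng = random.Random(seed)
    shuffled = list(pairs)
    rng.shuffle(shuffled)
    test, rest = shuffled[:1000], shuffled[1000:]
    cut = int(0.8 * len(rest))
    return rest[:cut], rest[cut:], test  # train, validation, test
```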

  • Batch size: 256

  • Initial learning rate: 1e-5

  • Adam optimizer

  • 30 epochs

  • Cross-entropy loss

  • Best checkpoint selected using WER as the evaluation metric

  • Decoding: beam search with beam size 5 (see the configuration sketch after this list)

  • S2S backbone model adopted from "Exploring data augmentation for code generation tasks"
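Assuming a Hugging Face Trainer-style setup (the original training code is not shown in this card), the hyperparameters above could be expressed roughly as follows; `output_dir` is illustrative.

```python
# Sketch of the listed hyperparameters using transformers' Seq2SeqTrainingArguments.
# Whether the original code uses this API is an assumption; the values come from
# the list above.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="checkpoints",          # illustrative path
    per_device_train_batch_size=256,   # batch size 256 (assuming a single device)
    learning_rate=1e-5,                # initial learning rate
    num_train_epochs=30,
    evaluation_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,       # keep the best checkpoint ...
    metric_for_best_model="wer",       # ... selected by WER
    greater_is_better=False,           # lower WER is better
    predict_with_generate=True,
    generation_num_beams=5,            # beam search with beam size 5
)
# Cross-entropy loss is the default for seq2seq language-modeling heads; note that
# the Trainer defaults to AdamW, a close variant of the Adam optimizer listed above.
```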

Continued-Training Settings:

  • 2 epochs of gold-gold training to prevent the over-correction problem on "TED talk data"
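In the gold-gold stage, the source and target of each training pair are both the reference transcript, which teaches the corrector to leave already-correct input unchanged. A minimal sketch of constructing such pairs, assuming a simple list-of-strings format:

```python
from typing import List, Tuple

def make_gold_gold_pairs(gold_transcripts: List[str]) -> List[Tuple[str, str]]:
    """Build (source, target) pairs whose two sides are identical.

    Briefly training on these identity pairs discourages the model from
    rewriting inputs that are already correct (over-correction).
    """
    return [(text, text) for text in gold_transcripts]
```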