Model Description
Pre-Training Settings:
166k samples from Common Voice 13.0 were recognized by Whisper tiny.en.
1,000 random samples were selected as the test set, and the rest were used for training and validation with an 80%/20% split.
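A minimal sketch of this data preparation step is shown below. The dataset identifier, column names, and random seed are assumptions for illustration, not taken from this card.

```python
from datasets import load_dataset
from transformers import pipeline

# Whisper tiny.en produces the hypothesis transcripts that the corrector
# will later learn to fix.
asr = pipeline("automatic-speech-recognition", model="openai/whisper-tiny.en")

# English split of Common Voice 13.0 (dataset/column names assumed).
cv = load_dataset("mozilla-foundation/common_voice_13_0", "en", split="train")

def transcribe(example):
    audio = example["audio"]
    hyp = asr({"raw": audio["array"], "sampling_rate": audio["sampling_rate"]})
    # Pair the recognized text with the gold reference transcript ("sentence").
    return {"hypothesis": hyp["text"], "reference": example["sentence"]}

cv = cv.map(transcribe)

# 1,000 random samples held out for testing; the remainder is split
# 80%/20% into training and validation.
split = cv.train_test_split(test_size=1000, seed=42)
test_set = split["test"]
train_val = split["train"].train_test_split(test_size=0.2, seed=42)
train_set, val_set = train_val["train"], train_val["test"]
```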
- Batch size: 256
- Initial learning rate: 1e-5
- Optimizer: Adam
- Epochs: 30
- Loss: cross-entropy
- Best checkpoint saved based on WER as the evaluation metric
- Decoding: beam search with a beam size of 5
- S2S backbone model adopted from "Exploring data augmentation for code generation tasks"; a configuration sketch follows this list
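The settings above can be expressed roughly as follows with Hugging Face `Seq2SeqTrainer`. Only the hyperparameter values come from this card; the backbone checkpoint name is a placeholder, the datasets are assumed to be tokenized beforehand, and AdamW stands in for the Adam optimizer.

```python
import numpy as np
import evaluate
from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          Seq2SeqTrainer, Seq2SeqTrainingArguments)

backbone = "t5-base"  # placeholder for the S2S backbone checkpoint
tokenizer = AutoTokenizer.from_pretrained(backbone)
model = AutoModelForSeq2SeqLM.from_pretrained(backbone)  # trained with the standard cross-entropy loss

wer = evaluate.load("wer")

def compute_metrics(eval_pred):
    preds, labels = eval_pred
    pred_str = tokenizer.batch_decode(preds, skip_special_tokens=True)
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    label_str = tokenizer.batch_decode(labels, skip_special_tokens=True)
    return {"wer": wer.compute(predictions=pred_str, references=label_str)}

args = Seq2SeqTrainingArguments(
    output_dir="s2s-correction",
    per_device_train_batch_size=256,
    learning_rate=1e-5,               # initial learning rate
    optim="adamw_torch",              # Adam-family optimizer
    num_train_epochs=30,
    evaluation_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,      # keep the best checkpoint
    metric_for_best_model="wer",
    greater_is_better=False,          # lower WER is better
    predict_with_generate=True,
    generation_num_beams=5,           # beam search with a beam size of 5
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=train_set,          # hypothesis/reference pairs, tokenized beforehand
    eval_dataset=val_set,
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
)
trainer.train()
```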
Continue-Training Settings:
- 2 epochs on gold-gold pairs to prevent the over-correction problem on TED talk data (see the sketch below)
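The continue-training stage could look roughly like the sketch below, assuming "gold-gold" means pairs in which both the source and the target are the gold reference transcript, so the model learns to leave already-correct text unchanged. `ted_gold` and its column name are placeholders for the TED talk data.

```python
# Build (gold, gold) pairs from the TED talk transcripts (names assumed).
gold_gold = ted_gold.map(lambda ex: {"source": ex["text"], "target": ex["text"]})

args.num_train_epochs = 2             # only 2 epochs, to limit over-correction
trainer = Seq2SeqTrainer(
    model=model,                      # starts from the pre-trained corrector
    args=args,
    train_dataset=gold_gold,          # tokenized before being passed in
    tokenizer=tokenizer,
)
trainer.train()
```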