---
language:
- ms
- en
---

# Malaysian Distil Whisper Large V3

Distil Whisper Large V3 trained on Malaysian datasets:

1. IMDA STT, https://huggingface.co/datasets/mesolitica/IMDA-STT
2. Pseudolabeled Malaysian YouTube videos, https://huggingface.co/datasets/mesolitica/pseudolabel-malaysian-youtube-whisper-large-v3

We follow the exact distillation process from https://github.com/huggingface/distil-whisper with minor changes.

Script at https://github.com/mesolitica/malaya-speech/tree/malaysian-speech/session/distill-whisper

Wandb at https://wandb.ai/huseinzol05/distil-whisper?workspace=user-huseinzol05
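
For readers unfamiliar with the distil-whisper recipe linked above: the student is trained on a weighted sum of a KL-divergence term against the teacher's output distribution and a cross-entropy term against the teacher's pseudo-labels. The sketch below illustrates that objective for a single token position in pure Python. It is not the training script from this repository; the function names, the weights, and the temperature are illustrative assumptions, and the "minor changes" mentioned above are not reflected here.

```python
import math
from typing import List, Sequence


def log_softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable log-softmax over one vocabulary-sized logit vector."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in scaled))
    return [x - log_norm for x in scaled]


def kl_divergence(
    teacher_logits: Sequence[float],
    student_logits: Sequence[float],
    temperature: float = 1.0,
) -> float:
    """KL(teacher || student) for a single token position."""
    teacher_logp = log_softmax(teacher_logits, temperature)
    student_logp = log_softmax(student_logits, temperature)
    return sum(
        math.exp(t) * (t - s) for t, s in zip(teacher_logp, student_logp)
    )


def cross_entropy(student_logits: Sequence[float], target_id: int) -> float:
    """Negative log-likelihood of the pseudo-label token under the student."""
    return -log_softmax(student_logits)[target_id]


def distillation_loss(
    teacher_logits: Sequence[float],
    student_logits: Sequence[float],
    pseudo_label_id: int,
    kl_weight: float = 1.0,    # illustrative value, not taken from the source
    ce_weight: float = 1.0,    # illustrative value, not taken from the source
    temperature: float = 1.0,  # illustrative value, not taken from the source
) -> float:
    """Weighted sum of the KL term and the pseudo-label cross-entropy term."""
    kl = kl_divergence(teacher_logits, student_logits, temperature)
    ce = cross_entropy(student_logits, pseudo_label_id)
    return kl_weight * kl + ce_weight * ce


# Toy example with a three-token vocabulary.
teacher = [2.0, 0.5, -1.0]
student = [1.5, 0.7, -0.5]
pseudo_label = 0  # index of the teacher's most likely token
loss = distillation_loss(teacher, student, pseudo_label)
```

In a real run these terms are averaged over every token position in a batch, and the teacher's logits come from a frozen Whisper Large V3 forward pass; the exact weighting used for this model should be read from the linked script.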