How to use vrclc/Malasar_Medium_DTF with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("automatic-speech-recognition", model="vrclc/Malasar_Medium_DTF")

# Load model directly
from transformers import AutoProcessor, AutoModelForSpeechSeq2Seq
processor = AutoProcessor.from_pretrained("vrclc/Malasar_Medium_DTF")
model = AutoModelForSpeechSeq2Seq.from_pretrained("vrclc/Malasar_Medium_DTF")

This model is a fine-tuned version of openai/whisper-medium on the Spoken Bible Corpus: Malasar dataset. It achieves the following results on the evaluation set:
The following results were recorded during training, with an evaluation every 250 steps:
| Training Loss | Epoch | Step | Validation Loss | WER |
|---|---|---|---|---|
| 0.0646 | 11.3636 | 250 | 0.3369 | 55.6254 |
| 0.0104 | 22.7273 | 500 | 0.4445 | 52.3130 |
| 0.0012 | 34.0909 | 750 | 0.4890 | 50.1428 |
| 0.0002 | 45.4545 | 1000 | 0.5240 | 50.3712 |
| 0.0002 | 56.8182 | 1250 | 0.5488 | 50.1999 |
| 0.0001 | 68.1818 | 1500 | 0.5695 | 50.3712 |
| 0.0001 | 79.5455 | 1750 | 0.5844 | 50.1999 |
| 0.0001 | 90.9091 | 2000 | 0.5907 | 50.2570 |
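The WER column is the word error rate: the number of word-level substitutions, deletions and insertions needed to turn the model's transcript into the reference, divided by the number of reference words. The card does not say how the metric was computed, so the following is only a minimal sketch of the standard definition. It assumes the values in the table are percentages and that words are split on whitespace with no text normalisation; `word_error_rate` is a hypothetical helper, not part of any library.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word error rate in percent, using word-level edit distance."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    # Scale to percent, assumed to match the scale used in the table above.
    return 100.0 * prev[-1] / len(ref)
```

Because insertions count as errors, the value can exceed 100 when the hypothesis is much longer than the reference.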
@misc{multistage2024,
  title={Multistage Fine-tuning Strategies for Automatic Speech Recognition in Low-resource Languages},
  author={Leena G Pillai and Kavya Manohar and Basil K Raju and Elizabeth Sherly},
  year={2024},
  eprint={2411.04573},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2411.04573},
}
Base model: openai/whisper-medium
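The pipeline snippet only constructs the recogniser; it does not show a transcription call. A minimal sketch of that last step might look like the following. The `transcribe` helper and the `MODEL_ID` constant are hypothetical names introduced for illustration. The sketch assumes the `transformers` library is installed and that the input is an audio file the pipeline can decode (passing a file path typically requires ffmpeg on the system).

```python
MODEL_ID = "vrclc/Malasar_Medium_DTF"


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Return the transcript of one audio file (hypothetical helper)."""
    # Imported lazily so that defining the helper does not load the library.
    from transformers import pipeline

    # Building the pipeline downloads the model weights on first use.
    pipe = pipeline("automatic-speech-recognition", model=model_id)

    # The ASR pipeline returns a dict whose "text" field holds the transcript.
    return pipe(audio_path)["text"]
```

Calling `transcribe("sample.wav")`, where `sample.wav` is a placeholder path, would return the recognised text as a string. To transcribe many files, it would be cheaper to build the pipeline once and reuse it than to call this helper in a loop.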