Update README.md
    - name: Wer
      type: wer
      value: 0.42621638924455824
language:
- hi
---

# whisper-tiny-finetune-hindi-fleurs

This model is a fine-tuned version of [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny) on the google/fleurs dataset.
It achieves the following results on the evaluation set:
- Wer Ortho: 0.4313
- Wer: 0.4262

A working Hugging Face Space can be found [here](https://huggingface.co/spaces/Aryan-401/whisper-tiny-finetune-hindi).
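As a minimal sketch of loading the checkpoint for inference with the `transformers` ASR pipeline (the repo id below is an assumption based on this card's title; substitute the actual checkpoint path):

```python
# Hypothetical inference sketch using the transformers ASR pipeline.
# MODEL_ID is assumed from this card's title, not confirmed by the card.
from transformers import pipeline

MODEL_ID = "Aryan-401/whisper-tiny-finetune-hindi-fleurs"  # assumed repo id

def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Transcribe a local audio file with the fine-tuned Whisper checkpoint."""
    asr = pipeline("automatic-speech-recognition", model=model_id)
    return asr(audio_path)["text"]

# Example (requires the checkpoint and an audio file):
# print(transcribe("sample_hindi.wav"))
```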

## Model description

Fine-tuned from [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny) on the google/fleurs dataset, this model improves the WER on the Hindi subset from 102.3% (zero-shot, as reported in the [Whisper paper](https://cdn.openai.com/papers/whisper.pdf)) to 42.6%.

## Intended uses & limitations

This model is intended for edge, low-compute devices such as the Raspberry Pi Pico/3/3B/4 and offers real-time transcription of Hindi audio into Romanized (Latin-script) text.

## Training and evaluation data

The model was trained on the `hi_in` subset of `google/fleurs`, with word error rate (WER) as the evaluation metric.
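For reference, WER is the word-level edit distance between hypothesis and reference divided by the number of reference words. A minimal illustrative sketch (not the evaluation code used for this model):

```python
# Minimal word error rate (WER) sketch: word-level Levenshtein distance
# divided by the number of reference words. Illustrative only.
def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)

print(wer("the cat sat", "the cat sat"))  # 0.0
print(wer("the cat sat", "the bat sat"))  # one substitution out of 3 words
```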

## Training procedure

### Framework versions

- Pytorch 2.1.0+cu121
- Datasets 2.16.0
- Tokenizers 0.15.0

## Citations

    @inproceedings{Bhat:2014:ISS:2824864.2824872,
      author    = {Bhat, Irshad Ahmad and Mujadia, Vandan and Tammewar, Aniruddha and Bhat, Riyaz Ahmad and Shrivastava, Manish},
      title     = {IIIT-H System Submission for FIRE2014 Shared Task on Transliterated Search},
      booktitle = {Proceedings of the Forum for Information Retrieval Evaluation},
      series    = {FIRE '14},
      year      = {2015},
      isbn      = {978-1-4503-3755-7},
      location  = {Bangalore, India},
      pages     = {48--53},
      numpages  = {6},
      url       = {http://doi.acm.org/10.1145/2824864.2824872},
      doi       = {10.1145/2824864.2824872},
      acmid     = {2824872},
      publisher = {ACM},
      address   = {New York, NY, USA},
      keywords  = {Information Retrieval, Language Identification, Language Modeling, Perplexity, Transliteration},
    }

    @misc{radford2022whisper,
      doi       = {10.48550/ARXIV.2212.04356},
      url       = {https://arxiv.org/abs/2212.04356},
      author    = {Radford, Alec and Kim, Jong Wook and Xu, Tao and Brockman, Greg and McLeavey, Christine and Sutskever, Ilya},
      title     = {Robust Speech Recognition via Large-Scale Weak Supervision},
      publisher = {arXiv},
      year      = {2022},
      copyright = {arXiv.org perpetual, non-exclusive license}
    }
|