Update README.md
    - name: Wer
      type: wer
      value: 0.42621638924455824
language:
- hi
---

# whisper-tiny-finetune-hindi-fleurs

This model is a fine-tuned version of [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny) on the google/fleurs dataset.
It achieves the following results on the evaluation set:
- Wer Ortho: 0.4313
- Wer: 0.4262

A working Hugging Face Space can be found [here](https://huggingface.co/spaces/Aryan-401/whisper-tiny-finetune-hindi).
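As a minimal sketch of loading the checkpoint for inference with the `transformers` ASR pipeline (the repo id below is an assumption based on this card's title; substitute the actual checkpoint path):

```python
# Hypothetical inference sketch using the transformers ASR pipeline.
# MODEL_ID is assumed from this card's title, not confirmed by the card.
from transformers import pipeline

MODEL_ID = "Aryan-401/whisper-tiny-finetune-hindi-fleurs"  # assumed repo id

def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Transcribe a local audio file with the fine-tuned Whisper checkpoint."""
    asr = pipeline("automatic-speech-recognition", model=model_id)
    return asr(audio_path)["text"]

# Example (requires the checkpoint and an audio file):
# print(transcribe("sample_hindi.wav"))
```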

## Model description

Fine-tuned from [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny) on the google/fleurs dataset, this model improves the WER on the Hindi subset from 102.3% (zero-shot, as reported in the [Whisper paper](https://cdn.openai.com/papers/whisper.pdf)) to 42.6%.

## Intended uses & limitations

This model is intended for edge, low-compute devices such as the Raspberry Pi Pico/3/3B/4 and offers real-time transcription of Hindi audio into Romanized (Latin-script) text.

## Training and evaluation data

The model was trained on the `hi_in` subset of `google/fleurs`, with word error rate (WER) as the evaluation metric.
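For reference, WER is the word-level edit distance between hypothesis and reference divided by the number of reference words. A minimal illustrative sketch (not the evaluation code used for this model):

```python
# Minimal word error rate (WER) sketch: word-level Levenshtein distance
# divided by the number of reference words. Illustrative only.
def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)

print(wer("the cat sat", "the cat sat"))  # 0.0
print(wer("the cat sat", "the bat sat"))  # one substitution out of 3 words
```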

## Training procedure

### Framework versions

- Pytorch 2.1.0+cu121
- Datasets 2.16.0
- Tokenizers 0.15.0

## Citations

    @inproceedings{Bhat:2014:ISS:2824864.2824872,
      author    = {Bhat, Irshad Ahmad and Mujadia, Vandan and Tammewar, Aniruddha and Bhat, Riyaz Ahmad and Shrivastava, Manish},
      title     = {IIIT-H System Submission for FIRE2014 Shared Task on Transliterated Search},
      booktitle = {Proceedings of the Forum for Information Retrieval Evaluation},
      series    = {FIRE '14},
      year      = {2015},
      isbn      = {978-1-4503-3755-7},
      location  = {Bangalore, India},
      pages     = {48--53},
      numpages  = {6},
      url       = {http://doi.acm.org/10.1145/2824864.2824872},
      doi       = {10.1145/2824864.2824872},
      acmid     = {2824872},
      publisher = {ACM},
      address   = {New York, NY, USA},
      keywords  = {Information Retrieval, Language Identification, Language Modeling, Perplexity, Transliteration},
    }

    @misc{radford2022whisper,
      doi       = {10.48550/ARXIV.2212.04356},
      url       = {https://arxiv.org/abs/2212.04356},
      author    = {Radford, Alec and Kim, Jong Wook and Xu, Tao and Brockman, Greg and McLeavey, Christine and Sutskever, Ilya},
      title     = {Robust Speech Recognition via Large-Scale Weak Supervision},
      publisher = {arXiv},
      year      = {2022},
      copyright = {arXiv.org perpetual, non-exclusive license}
    }
|