SakshiRathi77 committed on
Commit ce1a421
1 Parent(s): 42d081b

Update README.md

Files changed (1)
  1. README.md +87 -5
README.md CHANGED
@@ -1,11 +1,93 @@
  ---
  license: apache-2.0
  datasets:
- - mozilla-foundation/common_voice_13_0
  language:
- - hi
  metrics:
- - wer
- library_name: adapter-transformers
  pipeline_tag: automatic-speech-recognition
- ---
  ---
  license: apache-2.0
+ base_model: openai/whisper-small
+ tags:
+ - generated_from_trainer
  datasets:
+ - mozilla-foundation/common_voice_15_0
+ - mozilla-foundation/common_voice_13_0
  language:
+ - hi
  metrics:
+ - cer
+ - wer
+ library_name: transformers
  pipeline_tag: automatic-speech-recognition
+ model-index:
+ - name: whisper-small-hi-cv
+   results:
+   - task:
+       name: Automatic Speech Recognition
+       type: automatic-speech-recognition
+     dataset:
+       name: Common Voice 15
+       type: mozilla-foundation/common_voice_15_0
+       args: hi
+     metrics:
+     - name: Test WER
+       type: wer
+       value: 13.9913
+     - name: Test CER
+       type: cer
+       value: 5.8844
+   - task:
+       name: Automatic Speech Recognition
+       type: automatic-speech-recognition
+     dataset:
+       name: Common Voice 13
+       type: mozilla-foundation/common_voice_13_0
+       args: hi
+     metrics:
+     - name: Test WER
+       type: wer
+       value: 23.1361
+     - name: Test CER
+       type: cer
+       value: 10.4366
+
+ ---
+ # whisper-small-hi-cv
+
+ This model is a fine-tuned version of [openai/whisper-small](https://huggingface.co/openai/whisper-small) on the Common Voice 15 Hindi dataset.
+ It achieves the following results on the evaluation set:
+ - WER: 13.9913
+ - CER: 5.8844
+
+ ## Evaluation
+
+ ```python
+ from datasets import load_dataset, load_metric, Audio  # load_metric has since moved to the separate `evaluate` library
+ from transformers import WhisperForConditionalGeneration, WhisperProcessor
+ import torch
+
+ # Common Voice 13 Hindi test split and the two reported metrics.
+ test_dataset = load_dataset("mozilla-foundation/common_voice_13_0", "hi", split="test")
+ wer = load_metric("wer")
+ cer = load_metric("cer")
+
+ processor = WhisperProcessor.from_pretrained("kingabzpro/whisper-small-hi-cv")
+ model = WhisperForConditionalGeneration.from_pretrained("kingabzpro/whisper-small-hi-cv").to("cuda")
+ # Whisper expects 16 kHz audio, so resample the audio column on the fly.
+ test_dataset = test_dataset.cast_column("audio", Audio(sampling_rate=16000))
+
+ def map_to_pred(batch):
+     audio = batch["audio"]
+     input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
+     batch["reference"] = processor.tokenizer._normalize(batch["sentence"])
+
+     with torch.no_grad():
+         predicted_ids = model.generate(input_features.to("cuda"))[0]
+     transcription = processor.decode(predicted_ids, skip_special_tokens=True)
+     batch["prediction"] = processor.tokenizer._normalize(transcription)
+     return batch
+
+ result = test_dataset.map(map_to_pred)
+
+ print("WER: {:.2f}".format(100 * wer.compute(predictions=result["prediction"], references=result["reference"])))
+ print("CER: {:.2f}".format(100 * cer.compute(predictions=result["prediction"], references=result["reference"])))
+ ```
+ ```bash
+ WER: 23.1361
+ CER: 10.4366
+ ```
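
As a companion to the evaluation script in the card above, here is a minimal transcription sketch using the `transformers` ASR pipeline. It is illustrative only: the checkpoint id is reused from the card's evaluation code, `sample_hi.wav` is a placeholder file, and the `language`/`task` generation arguments assume a reasonably recent `transformers` release.

```python
from transformers import pipeline

# Checkpoint id taken from the card's evaluation script; substitute your own repo id if it differs.
asr = pipeline(
    "automatic-speech-recognition",
    model="kingabzpro/whisper-small-hi-cv",
    device=0,  # set to -1 to run on CPU
)

# Whisper models operate on 16 kHz audio; the pipeline resamples local files automatically.
result = asr(
    "sample_hi.wav",  # placeholder path to a Hindi speech recording
    generate_kwargs={"language": "hindi", "task": "transcribe"},
)
print(result["text"])
```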