poonehmousavi committed on
Commit
4add3fa
1 Parent(s): 68ec074

Update README.md

  1. README.md +14 -16
README.md CHANGED
@@ -1,6 +1,4 @@
 ---
-language:
-- ar
 thumbnail: null
 pipeline_tag: automatic-speech-recognition
 tags:
@@ -16,31 +14,31 @@ metrics:
 - wer
 - cer
 model-index:
-- name: asr-whisper-medium-commonvoice-ar
+- name: asr-whisper-medium-commonvoice-sr
   results:
   - task:
       name: Automatic Speech Recognition
       type: automatic-speech-recognition
     dataset:
-      name: CommonVoice 10.0 (Arabic)
+      name: CommonVoice 10.0 (Serbian)
       type: mozilla-foundation/common_voice_14_0
-      config: ar
+      config: sr
       split: test
       args:
-        language: ar
+        language: sr
     metrics:
     - name: Test WER
       type: wer
-      value: '14.82'
+      value: '25.10'
 ---
 
 <iframe src="https://ghbtns.com/github-btn.html?user=speechbrain&repo=speechbrain&type=star&count=true&size=medium" frameborder="0" scrolling="0" width="170" height="30" title="GitHub"></iframe>
 <br/><br/>
 
-# whisper medium fine-tuned on CommonVoice-14.0 Arabic
+# whisper medium fine-tuned on CommonVoice-14.0 Serbian
 
 This repository provides all the necessary tools to perform automatic speech
-recognition from an end-to-end whisper model fine-tuned on CommonVoice (Arabic Language) within
+recognition from an end-to-end whisper model fine-tuned on CommonVoice (Serbian Language) within
 SpeechBrain. For a better experience, we encourage you to learn more about
 [SpeechBrain](https://speechbrain.github.io).
 
@@ -48,14 +46,14 @@ The performance of the model is the following:
 
 | Release | Test CER | Test WER | GPUs |
 |:-------------:|:--------------:|:--------------:| :--------:|
-| 1-08-23 | 4.95 | 14.82 | 1xV100 32GB |
+| 1-08-23 | 8.63 | 25.10 | 1xV100 32GB |
 
 ## Pipeline description
 
 This ASR system is composed of whisper encoder-decoder blocks:
 - The pretrained whisper-medium encoder is frozen.
 - The pretrained Whisper tokenizer is used.
-- A pretrained Whisper-medium decoder ([openai/whisper-medium](https://huggingface.co/openai/whisper-medium)) is finetuned on CommonVoice ar.
+- A pretrained Whisper-medium decoder ([openai/whisper-medium](https://huggingface.co/openai/whisper-medium)) is finetuned on CommonVoice sr.
 The obtained final acoustic representation is given to the greedy decoder.
 
 The system is trained with recordings sampled at 16kHz (single channel).
@@ -72,14 +70,14 @@ pip install speechbrain transformers
 Please notice that we encourage you to read our tutorials and learn more about
 [SpeechBrain](https://speechbrain.github.io).
 
-### Transcribing your own audio files (in Arabic)
+### Transcribing your own audio files (in Serbian)
 
 ```python
 
 from speechbrain.pretrained import WhisperASR
 
-asr_model = WhisperASR.from_hparams(source="speechbrain/asr-whisper-medium-commonvoice-ar", savedir="pretrained_models/asr-whisper-medium-commonvoice-ar")
-asr_model.transcribe_file("speechbrain/asr-whisper-lmedium-commonvoice-ar/example-ar.mp3")
+asr_model = WhisperASR.from_hparams(source="speechbrain/asr-whisper-medium-commonvoice-sr", savedir="pretrained_models/asr-whisper-medium-commonvoice-sr")
+asr_model.transcribe_file("speechbrain/asr-whisper-lmedium-commonvoice-sr/example-sr.mp3")
 
 
 ```
@@ -103,7 +101,7 @@ pip install -e .
 3. Run Training:
 ```bash
 cd recipes/CommonVoice/ASR/transformer/
-python train_with_whisper.py hparams/train_ar_hf_whisper.yaml --data_folder=your_data_folder
+python train_with_whisper.py hparams/sr.yaml --data_folder=your_data_folder
 ```
 
 You can find our training results (models, logs, etc) [here](https://drive.google.com/drive/folders/11PKCsyIE703mmDv6n6n_UnD0bUgMPbg_?usp=share_link).
@@ -129,4 +127,4 @@ SpeechBrain is an open-source and all-in-one speech toolkit. It is designed to b
 
 Website: https://speechbrain.github.io/
 
-GitHub: https://github.com/speechbrain/speechbrain
+GitHub: https://github.com/speechbrain/speechbrain
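
The diff above changes the reported Test WER from 14.82 to 25.10 and the Test CER from 4.95 to 8.63. As a reminder of what these numbers measure, here is a minimal sketch of word and character error rate based on Levenshtein distance. It is not SpeechBrain's scoring code: the function names `edit_distance`, `wer`, and `cer` are hypothetical, the example sentences are made up, and the sketch scores a single utterance without any text normalization, whereas a real evaluation typically aggregates errors over the whole test set.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref to each hyp prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to empty hyp
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion (ref token missing from hyp)
                curr[j - 1] + 1,     # insertion (extra token in hyp)
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate in percent for one whitespace-tokenized utterance."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate in percent (spaces count as characters here)."""
    if not reference:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(list(reference), list(hypothesis)) / len(reference)


if __name__ == "__main__":
    # One of three reference words is missing from the hypothesis.
    print(f"WER: {wer('dobar dan svima', 'dobar dan'):.2f}")  # WER: 33.33
```

Under this definition, a WER of 25.10 corresponds to roughly one word-level error for every four reference words.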