Set a definition for WER
README.md
CHANGED
@@ -41,6 +41,8 @@ Please install:
 
 We evaluated the model against different Arabic-STT Wav2Vec models.
 
+[**WER**: Word Error Rate] The lower the score, the better the model.
+
 | | Model | [using transliteration](https://pypi.org/project/lang-trans/) | WER | Training Datasets |
 |---:|:--------------------------------------|:---------------------|---------:|---------:|
 | 1 | bakrianoo/sinai-voice-ar-stt | True | 0.238001 |Common Voice 6|
@@ -80,8 +82,8 @@ resamplers = { # all three sampling rates exist in test split
 transformation = jiwer.Compose([
     # normalize some diacritics, remove punctuation, and replace Persian letters with Arabic ones
     jiwer.SubstituteRegexes({
-        r'[auiFNKo
-        r"[
+        r'[auiFNKo\~_،؟»\?;:\-,\.؛«!"]': "", "\u06D6": "",
+        r"[\|\{]": "A", "p": "h", "ک": "k", "ی": "y"}),
     # default transformation below
     jiwer.RemoveMultipleSpaces(),
     jiwer.Strip(),
@@ -274,8 +276,8 @@ test_split = test_split.map(predict, batched=True, batch_size=16, remove_columns
 transformation = jiwer.Compose([
     # normalize some diacritics, remove punctuation, and replace Persian letters with Arabic ones
     jiwer.SubstituteRegexes({
-        r'[auiFNKo
-        r"[
+        r'[auiFNKo\~_،؟»\?;:\-,\.؛«!"]': "", "\u06D6": "",
+        r"[\|\{]": "A", "p": "h", "ک": "k", "ی": "y"}),
     # default transformation below
     jiwer.RemoveMultipleSpaces(),
     jiwer.Strip(),
@@ -293,6 +295,8 @@ print(f"WER: {metrics['wer']:.2%}")
 ```
 **Test Result**: 23.80%
 
+[**WER**: Word Error Rate] The lower the score, the better the model.
+
 
 ## Other Arabic Voice recognition Models
 
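To make the WER definition added in this commit concrete, here is a minimal sketch of how the score can be computed after text normalization. It uses only the standard library rather than `jiwer`, so the helper names `normalize` and `word_error_rate` are hypothetical and not part of the README. The substitution rules are copied from the diff under the assumption that each escape is meant to be a single backslash, and the sample strings are made-up input in what appears to be Buckwalter-style transliteration.

```python
import re

# Substitution rules from the diff, applied in insertion order.
# Assumption: every escape in the README is meant to be a single backslash.
RULES = {
    # transliterated diacritics (a, u, i, F, N, K, o, ~), tatweel, punctuation
    r'[auiFNKo\~_،؟»\?;:\-,\.؛«!"]': "",
    # U+06D6, a Quranic annotation mark
    "\u06D6": "",
    # transliterated alef variants -> plain alef
    r"[\|\{]": "A",
    # transliterated ta marbuta -> ha
    "p": "h",
    # Persian kaf and yeh -> transliterated Arabic kaf and yeh
    "ک": "k",
    "ی": "y",
}


def normalize(text):
    """Apply the substitution rules, then collapse and strip whitespace."""
    for pattern, replacement in RULES.items():
        text = re.sub(pattern, replacement, text)
    return " ".join(text.split())


def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the distance between the reference words consumed so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


reference = normalize("kitaAbN jadiyd, fiy Almadrasap!")
hypothesis = normalize("kitaAbN qadiym fiy Almadrasap")
print(reference)                                        # ktAb jdyd fy Almdrsh
print(f"WER: {word_error_rate(reference, hypothesis):.2%}")  # WER: 25.00%
```

One wrong word out of four reference words gives a WER of 25%. By the same reading, the reported score of 0.238001 corresponds to roughly 24 word edits per 100 reference words, which is why a lower score indicates a better model.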