Commit 09c65cf by kgnlp (1 parent: 0629ef9)

Updated model information

  1. README.md +37 -0
README.md CHANGED
@@ -1,3 +1,40 @@
  ---
  license: apache-2.0
+ datasets:
+ - mozilla-foundation/common_voice_10_0
+ base_model:
+ - facebook/wav2vec2-xls-r-300m
+ tags:
+ - pytorch
+ - phoneme-recognition
+ pipeline_tag: automatic-speech-recognition
  ---
+
+ Model Information
+ =================
+
+ Allophant is a multilingual phoneme recognizer trained on spoken sentences in 34 languages, capable of generalizing zero-shot to unseen phoneme inventories.
+
+ The model is based on [facebook/wav2vec2-xls-r-300m](https://huggingface.co/facebook/wav2vec2-xls-r-300m) and was pre-trained on a subset of the [Common Voice Corpus 10.0](https://huggingface.co/datasets/mozilla-foundation/common_voice_10_0) transcribed with [eSpeak NG](https://github.com/espeak-ng/espeak-ng).
+
+ | Model Name | UCLA Phonetic Corpus (PER) | UCLA Phonetic Corpus (AER) | Common Voice (PER) | Common Voice (AER) |
+ | ---------------- | ---------: | ---------: | -------: | -------: |
+ | **Multitask** | **45.62%** | 19.44% | **34.34%** | **8.36%** |
+ | [Hierarchical](https://huggingface.co/kgnlp/allophant-hierarchical) | 46.09% | **19.18%** | 34.35% | 8.56% |
+ | [Multitask Shared](https://huggingface.co/kgnlp/allophant-shared) | 46.05% | 19.52% | 41.20% | 8.88% |
+ | [Baseline Shared](https://huggingface.co/kgnlp/allophant-baseline-shared) | 48.25% | - | 45.35% | - |
+ | [Baseline](https://huggingface.co/kgnlp/allophant-baseline) | 57.01% | - | 46.95% | - |
+
+ Note that our baseline models were trained without phonetic feature classifiers and therefore only support phoneme recognition.
+
+ Citation
+ ========
+
+ ```bibtex
+ @inproceedings{glocker2023allophant,
+ title={Allophant: Cross-lingual Phoneme Recognition with Articulatory Attributes},
+ author={Glocker, Kevin and Herygers, Aaricia and Georges, Munir},
+ year={2023},
+ booktitle={{Proc. Interspeech 2023}},
+ month={8}}
+ ```
+ ```
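
The PER columns in the results table report phoneme error rates, but the README does not spell out how they are computed. The sketch below assumes the conventional definition: the Levenshtein (edit) distance between reference and predicted phoneme sequences, summed over all utterances and divided by the total number of reference phonemes. The function names and the toy data are hypothetical and not taken from the Allophant code base, and the evaluation that produced the table may differ in details such as normalization or aggregation. The AER columns are not covered here.

```python
from typing import Sequence


def edit_distance(reference: Sequence[str], hypothesis: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences with unit costs."""
    # At the start of iteration i, previous[j] is the distance between
    # reference[:i - 1] and hypothesis[:j].
    previous = list(range(len(hypothesis) + 1))
    for i, ref_symbol in enumerate(reference, start=1):
        current = [i]  # distance between reference[:i] and an empty hypothesis
        for j, hyp_symbol in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_symbol != hyp_symbol)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def phoneme_error_rate(
    references: Sequence[Sequence[str]],
    hypotheses: Sequence[Sequence[str]],
) -> float:
    """Corpus-level PER: total edit operations / total reference phonemes.

    Assumption: errors are pooled over the corpus rather than averaged
    per utterance.
    """
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_length = sum(len(reference) for reference in references)
    if total_length == 0:
        raise ValueError("references must contain at least one phoneme")
    total_errors = sum(
        edit_distance(reference, hypothesis)
        for reference, hypothesis in zip(references, hypotheses)
    )
    return total_errors / total_length


if __name__ == "__main__":
    # Toy data for illustration only, not model output.
    references = [["k", "æ", "t"], ["d", "ɔ", "ɡ"]]
    hypotheses = [["k", "a", "t"], ["d", "ɔ"]]
    # One substitution plus one deletion over six reference phonemes.
    print(f"PER: {phoneme_error_rate(references, hypotheses):.2%}")  # PER: 33.33%
```

Under this definition a PER above 100% is possible when the prediction contains many insertions, which is one reason the metric is usually read alongside the reference lengths.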