AliMeeting
This is an ASR recipe for the AliMeeting corpus. AliMeeting provides recordings from the speakers' headset microphones and from an 8-channel microphone array. We pool the data from the following four conditions and train a single model on the pooled data:
(i) Individual headset microphone (IHM)
(ii) IHM with simulated reverb
(iii) Single distant microphone (SDM)
(iv) GSS-enhanced array microphones
Speed perturbation and MUSAN noise augmentation are additionally performed on the pooled data.
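
The pooling described above can be pictured with a small, library-free sketch. It is only an illustration and not the recipe's actual data preparation code: the condition labels, the `pool_training_cuts` helper, and the speed factors 0.9/1.0/1.1 are assumptions made for this example, and MUSAN noise mixing (typically applied on the fly during training) is not modeled.

```python
from itertools import product

# Hypothetical labels for the four pooled conditions listed above.
CONDITIONS = ["ihm", "ihm_rvb", "sdm", "gss"]

# Assumed speed-perturbation factors; this README does not state them.
SPEED_FACTORS = [0.9, 1.0, 1.1]


def pool_training_cuts(utterance_ids, conditions=CONDITIONS, speeds=SPEED_FACTORS):
    """Return one training entry per (utterance, condition, speed) combination."""
    pooled = []
    for utt, cond, speed in product(utterance_ids, conditions, speeds):
        pooled.append(
            {
                "id": f"{utt}_{cond}_sp{speed}",
                "source": utt,
                "condition": cond,
                "speed": speed,
            }
        )
    return pooled


pooled = pool_training_cuts(["utt1", "utt2"])
print(len(pooled))  # 2 utterances x 4 conditions x 3 speeds = 24
```

Under these assumptions, every utterance contributes one entry per condition and speed factor, and a single model sees all of them in one training set.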
Performance Record
pruned_transducer_stateless7
The following results were decoded using `modified_beam_search`:
| Evaluation set | eval CER (%) | test CER (%) |
|---|---|---|
| IHM | 9.58 | 11.53 |
| SDM | 23.37 | 25.85 |
| MDM (GSS-enhanced) | 11.82 | 14.22 |
See the recipe for details.
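
For readers unfamiliar with the metric, character error rate (CER) is the character-level edit distance between reference and hypothesis, divided by the number of reference characters. The sketch below is a minimal illustration of that definition; the function names are hypothetical, and the recipe's own scoring may apply text normalization that is not shown here.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insertions, deletions, substitutions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(refs, hyps):
    """Corpus-level CER in percent: total character errors / total reference characters."""
    total_errors = sum(edit_distance(list(r), list(h)) for r, h in zip(refs, hyps))
    total_chars = sum(len(r) for r in refs)
    return 100.0 * total_errors / total_chars


# One substituted character out of six reference characters.
print(round(cer(["今天天气很好"], ["今天天气真好"]), 2))  # 16.67
```

Errors are summed over the whole set before dividing, so longer utterances weigh more than shorter ones.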