## Model and data descriptions

This is a wav2vec 2.0 base model trained on the Niger-Mali audio collection and the Tamasheq-French speech corpus. Combined, these corpora contain 111 hours of French, 109 hours of Fulfulde, 100 hours of Hausa, 243 hours of Tamasheq, and 95 hours of Zarma (658 hours in total).

Both corpora are presented in [Boito et al., 2022](https://arxiv.org/abs/2201.05051).
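
To make the language balance of the training data easier to see, the following minimal sketch computes the total duration and each language's share from the hour counts listed above. The names `HOURS_PER_LANGUAGE` and `language_shares` are illustrative and not part of any released code.

```python
# Hours of speech per language, copied from the description above.
HOURS_PER_LANGUAGE = {
    "French": 111,
    "Fulfulde": 109,
    "Hausa": 100,
    "Tamasheq": 243,
    "Zarma": 95,
}


def language_shares(hours):
    """Return each language's fraction of the total training audio."""
    total = sum(hours.values())
    return {lang: h / total for lang, h in hours.items()}


total_hours = sum(HOURS_PER_LANGUAGE.values())
shares = language_shares(HOURS_PER_LANGUAGE)

# Print languages from largest to smallest share of the data.
for lang, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{lang:<9} {HOURS_PER_LANGUAGE[lang]:>4} h  {share:6.1%}")
print(f"Total     {total_hours:>4} h")
```

Tamasheq makes up a little over a third of the audio (about 37%), while each of the other four languages contributes roughly 14% to 17%.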

## Intended uses & limitations

Pretrained wav2vec 2.0 models are distributed under the Apache-2.0 license, so they can be reused widely, subject only to the terms of that license.

## Referencing our IWSLT models

```
@article{boito2022trac,
  title={ON-TRAC Consortium Systems for the IWSLT 2022 Dialect and Low-resource Speech Translation Tasks},
  author={Boito, Marcely Zanon and Ortega, John and Riguidel, Hugo and Laurent, Antoine and Barrault, Lo{\"\i}c and Bougares, Fethi and Chaabani, Firas and Nguyen, Ha and Barbier, Florentin and Gahbiche, Souhir and others},
  journal={IWSLT},
  year={2022}
}
```