|
---
|
|
language: pt
|
|
datasets:
|
|
- CORAA
|
|
- common_voice
|
|
- mls
|
|
- cetuc
|
|
- voxforge
|
|
metrics:
|
|
- wer
|
|
tags:
|
|
- audio
|
|
- speech
|
|
- wav2vec2
|
|
- pt
|
|
- portuguese-speech-corpus
|
|
- automatic-speech-recognition
|
|
- speech
|
|
- PyTorch
|
|
license: apache-2.0
|
|
model-index:
|
|
- name: Alef Iury XLSR Wav2Vec2 Large 53 Portuguese
|
|
results:
|
|
- task:
|
|
name: Speech Recognition
|
|
type: automatic-speech-recognition
|
|
metrics:
|
|
- name: Test CORAA WER
|
|
type: wer
|
|
value: 24.89%
|
|
---
|
|
|
|
# Wav2vec 2.0 trained with CORAA Portuguese Dataset and Open Portuguese Datasets
|
|
|
|
This a the demonstration of a fine-tuned Wav2vec model for Portuguese using the following datasets:
|
|
|
|
- [CORAA dataset](https://github.com/nilc-nlp/CORAA)
|
|
- [CETUC](http://www02.smt.ufrj.br/~igor.quintanilha/alcaim.tar.gz).
|
|
- [Multilingual Librispeech (MLS)](http://www.openslr.org/94/).
|
|
- [VoxForge](http://www.voxforge.org/).
|
|
- [Common Voice 6.1](https://commonvoice.mozilla.org/pt).
|
|
|
|
## Repository
|
|
|
|
The repository that implements the model to be trained and tested is avaible [here](https://github.com/alefiury/SE-R_2022_Challenge_Wav2vec2). |