speechbrainteam committed
Commit 4ceee33 · 1 Parent(s): a6ead3b
Update README.md
README.md
CHANGED
@@ -25,7 +25,7 @@ recognition from an end-to-end system pretrained on CommonVoice (French Language
 SpeechBrain. For a better experience, we encourage you to learn more about
 [SpeechBrain](https://speechbrain.github.io).
 
-The performance of the model is the following
+The performance of the model is the following:
 
 | Release | Test CER | Test WER | GPUs |
 |:-------------:|:--------------:|:--------------:| :--------:|
@@ -34,9 +34,9 @@ The performance of the model is the following.
 ## Pipeline description
 
 This ASR system is composed of 2 different but linked blocks:
-1
+1-Tokenizer (unigram) that transforms words into subword units and trained with
 the train transcriptions (train.tsv) of CommonVoice (FR).
-2
+2- Acoustic model (CRDNN + CTC/Attention). The CRDNN architecture is made of
 N blocks of convolutional neural networks with normalization and pooling on the
 frequency domain. Then, a bidirectional LSTM is connected to a final DNN to obtain
 the final acoustic representation that is given to the CTC and attention decoders.
@@ -78,7 +78,7 @@ The SpeechBrain team does not provide any warranty on the performance achieved b
 year = {2021},
 publisher = {GitHub},
 journal = {GitHub repository},
-howpublished = {
+howpublished = {\url{https://github.com/speechbrain/speechbrain}},
 }
 ```
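The updated pipeline description covers two numeric ideas worth making concrete: the CRDNN's convolutional blocks pool along the frequency axis before the BiLSTM, and the final representation feeds joint CTC and attention decoders. Below is a minimal, pure-Python sketch of both computations. Every concrete number here (40 filterbank features, 2 blocks, 2x pooling, 0.3 CTC weight) is an illustrative assumption, not this model's actual hyperparameters.

```python
def freq_dim_after_cnn(n_features: int, n_blocks: int, pool_factor: int = 2) -> int:
    """Frequency dimension remaining after N conv blocks,
    each pooling the frequency axis by pool_factor."""
    dim = n_features
    for _ in range(n_blocks):
        dim //= pool_factor  # pooling on the frequency domain shrinks the axis
    return dim

def joint_score(log_p_ctc: float, log_p_att: float, ctc_weight: float = 0.3) -> float:
    """Log-linear combination of CTC and attention decoder scores,
    as is typical for joint CTC/attention decoding."""
    return ctc_weight * log_p_ctc + (1.0 - ctc_weight) * log_p_att

# With 40-dim filterbank features and 2 blocks of 2x frequency pooling,
# the BiLSTM sees 10 frequency bins per channel.
print(freq_dim_after_cnn(40, 2))           # -> 10
# Equal weighting of two hypothetical decoder log-probabilities:
print(joint_score(-1.0, -2.0, 0.5))        # -> -1.5
```

The pooling arithmetic explains why the BiLSTM input size depends on both the feature extractor and the CNN configuration; the score combination shows how the two decoders named in the description can contribute to one beam-search hypothesis score.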