bengaliAI/BanglaConformer

Automatic Speech Recognition


appledora committed on Jul 7, 2023

Commit 589f132 (1 parent: 13d73b2)

Update README.md

Files changed (1)

README.md +7 -1


@@ -40,10 +40,16 @@ if  not os.path.exists("<RESAMPLED AUDIO FILE PATH>"):
 	tfm.build(input_filepath= "<AUDIO FILE PATH>", output_filepath= "<RESAMPLED AUDIO FILE PATH>")
 ```
 ## Training
-We used the official [NeMo documentation on training an ASR model](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/asr/examples/kinyarwanda_asr.html) to prepare our transcript manifest and train our model. However, we did not train any custom tokenizer and instead downloaded the tokenizer from [banglaBERT-large](https://huggingface.co/csebuetnlp/banglabert_large/) for better vocabulary coverage.  For validation, we have used `29589` samples separated from the training data and processed accordingly. The final  validation score was `22.4% WER` , at epoch `164`.
+We used the official [NeMo documentation on training an ASR model](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/asr/examples/kinyarwanda_asr.html)
+to prepare our transcript manifest and train our model. However, we did not train any custom tokenizer and instead downloaded the tokenizer
+from [banglaBERT-large](https://huggingface.co/csebuetnlp/banglabert_large/) for better vocabulary coverage.
+For validation, we have used `29589` samples separated from the training data and processed accordingly.
+**The final  validation score was `22.4% WER` , at epoch `164`.**
 Training script : [training.sh](training.sh)
 ## Evaluation
 `14,016` test samples have been used to evaluate the dataset. The generated output file contains both ground truth and predicted strings. The final result is the Word Error Rate (WER) and Character Error Rate (CER) for the model.
 Evaluation script: [evaluation.sh](evaluation.sh)
 **Test Dataset WER/CER 69.25%/42.13%**
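
The Evaluation section reports Word Error Rate (WER) and Character Error Rate (CER). As a rough illustration of how those two metrics are conventionally defined, here is a minimal sketch based on Levenshtein edit distance. It is not taken from `evaluation.sh`; the function names `edit_distance`, `wer`, and `cer` are hypothetical, and the actual evaluation may apply text normalization that this sketch does not. Note also that `cer` here counts Unicode code points, which for Bengali text can differ from counting visible graphemes.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists of words or strings)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer(references, hypotheses):
    """Corpus-level WER: total word edits divided by total reference words."""
    errors = sum(edit_distance(r.split(), h.split())
                 for r, h in zip(references, hypotheses))
    words = sum(len(r.split()) for r in references)
    return errors / words if words else 0.0


def cer(references, hypotheses):
    """Corpus-level CER: total character edits divided by total reference characters."""
    errors = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    chars = sum(len(r) for r in references)
    return errors / chars if chars else 0.0


if __name__ == "__main__":
    # Hypothetical ground truth / prediction pair, for illustration only.
    refs = ["আমি ভাত খাই"]
    hyps = ["আমি ভাত খাব"]
    print(f"WER: {wer(refs, hyps):.2%}")
    print(f"CER: {cer(refs, hyps):.2%}")
```

Because both metrics divide by the reference length, a single wrong word in a three-word reference already yields a WER of about 33%, which is why WER is typically much higher than CER on the same output.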