KenLM language model
#1, opened by GaetanBaert
Hello,
Which dataset did you use to train the KenLM model?
Also, what parameters did you use?
Hello @GaetanBaert! To build the LM, I used the data from Common Voice 8.0, MediaSpeech, Multilingual TEDx, Multilingual LibriSpeech, and VoxPopuli. I converted all the text to lowercase and removed the punctuation.
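
For anyone who wants to reproduce the text normalization described above, here is a minimal sketch in Python. It is not the original preprocessing script: the helper name `normalize_line` is hypothetical, and treating every Unicode punctuation character (including apostrophes and hyphens) as something to replace with a space is an assumption, since the exact rules are not given in this thread.

```python
import unicodedata


def normalize_line(text: str) -> str:
    """Lowercase a transcript line and strip punctuation (hypothetical helper)."""
    lowered = text.lower()
    # Assumption: "punctuation" means any character whose Unicode category
    # starts with "P". Each one is replaced by a space so that neighbouring
    # words are not glued together.
    cleaned = "".join(
        " " if unicodedata.category(ch).startswith("P") else ch
        for ch in lowered
    )
    # Collapse the runs of whitespace left behind by the replacement.
    return " ".join(cleaned.split())


if __name__ == "__main__":
    print(normalize_line("Hello, World!"))
```

Writing one normalized sentence per line gives the kind of plain-text corpus that KenLM's `lmplz` tool reads as input. The n-gram order and other training parameters are not stated in this thread, so they are left out here.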