KenLM language model

by GaetanBaert - opened Sep 22, 2022

Sep 22, 2022

Hello,
Which dataset did you use to train the KenLM model?
Also, what parameters did you use?

Owner Dec 15, 2022

Hello @GaetanBaert! To build the LM, I used the text from Common Voice 8.0, MediaSpeech, Multilingual TEDx, Multilingual LibriSpeech, and VoxPopuli. I converted all the text to lowercase and removed the punctuation.
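
For anyone wanting to reproduce the text normalization described above, here is a minimal sketch of what lowercasing and punctuation removal could look like. It is not the script actually used for this model. The function name `normalize_text` and the `keep` parameter are hypothetical, and replacing punctuation with a space (rather than deleting it) is an assumption.

```python
import unicodedata


def normalize_text(line: str, keep: str = "") -> str:
    """Lowercase a line and strip punctuation (hypothetical helper).

    Assumption: "punctuation" means any character whose Unicode category
    starts with "P". Characters listed in `keep` are left untouched.
    """
    lowered = line.lower()
    # Replace each punctuation character with a space so that words
    # separated only by punctuation do not get glued together.
    chars = [
        " " if unicodedata.category(ch).startswith("P") and ch not in keep else ch
        for ch in lowered
    ]
    # Collapse runs of whitespace into single spaces and trim the ends.
    return " ".join("".join(chars).split())


if __name__ == "__main__":
    print(normalize_text("Hello, World!"))
    print(normalize_text("L'homme est là.", keep="'"))
```

Whether apostrophes or hyphens should be kept depends on the language and on the vocabulary of the acoustic model, so `keep` is left as an explicit choice here.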
