Automatic Speech Recognition for the Belarusian language

A fine-tuned version of facebook/wav2vec2-base, trained on the be (Belarusian) subset of the mozilla-foundation/common_voice_8_0 dataset.
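A minimal usage sketch follows, assuming the checkpoint is loaded by its model id ales/wav2vec2-cv-be through the Hugging Face transformers pipeline API. The helper name `transcribe` and the constants are illustrative, not part of the original project. Whether the pipeline also applies the language model depends on how the repository is packaged and on which optional decoding libraries are installed, so treat the output as acoustic-model output unless you verify otherwise.

```python
from typing import Sequence

MODEL_ID = "ales/wav2vec2-cv-be"
SAMPLING_RATE = 16_000  # wav2vec2-base checkpoints expect 16 kHz mono audio


def transcribe(waveform: Sequence[float], model_id: str = MODEL_ID) -> str:
    """Transcribe a mono waveform sampled at 16 kHz and return the text.

    `transcribe` is a hypothetical helper written for this sketch.
    """
    # Imported lazily so that defining the function needs no heavy dependencies.
    import numpy as np
    from transformers import pipeline

    # Downloads the checkpoint on first use (network access required).
    asr = pipeline("automatic-speech-recognition", model=model_id)
    audio = {
        "raw": np.asarray(waveform, dtype=np.float32),
        "sampling_rate": SAMPLING_RATE,
    }
    return asr(audio)["text"]
```

Calling `transcribe(samples)` with a list or array of float samples returns the recognized text. Audio recorded at another sampling rate should be resampled to 16 kHz first.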

The Train, Dev, and Test splits were used exactly as they are provided in the dataset. No additional data from the Validated split was used, so each sentence has only one voicing, which is how CommonVoice CorporaCreator splits the data. To build a better model, one can enlarge the Train, Dev, and Test splits with additional voicings from the Validated split for sentences that are already present in those splits.
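The enlargement idea can be sketched as follows. This is an illustration only, not the author's code: the function name `enlarge_split` is hypothetical, and each Validated row is assumed to be a dict with `sentence` and `path` keys, mirroring the column names in the Common Voice TSV files. Every extra voicing is assigned to the split that already contains its sentence, so no sentence ends up in more than one split.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


def enlarge_split(
    split_sentences: Set[str],
    validated_rows: Iterable[Dict[str, str]],
) -> Dict[str, List[str]]:
    """Collect every validated clip whose sentence already belongs to a split.

    Returns a mapping from sentence to the list of clip paths voicing it.
    """
    clips: Dict[str, List[str]] = defaultdict(list)
    for row in validated_rows:
        sentence = row["sentence"]
        # Skip sentences outside this split and clips already collected.
        if sentence in split_sentences and row["path"] not in clips[sentence]:
            clips[sentence].append(row["path"])
    return dict(clips)
```

Running this once per split (Train, Dev, Test) with that split's sentence set yields the enlarged splits while keeping their sentences disjoint.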

The language model was built with KenLM. It is a 5-gram model trained on sentences from the Train + (Other - Dev - Test) splits of the mozilla-foundation/common_voice_8_0 be dataset.
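The corpus expression Train + (Other - Dev - Test) can be read as: all Train sentences, plus those Other sentences that appear in neither Dev nor Test. The sketch below implements that reading. The function name `build_lm_corpus` is hypothetical, and no text normalization or deduplication is applied because the source does not describe any.

```python
from typing import Iterable, List


def build_lm_corpus(
    train: Iterable[str],
    other: Iterable[str],
    dev: Iterable[str],
    test: Iterable[str],
) -> List[str]:
    """Return Train + (Other - Dev - Test) as a list of sentences."""
    # Sentences used for evaluation must not leak into the language model.
    held_out = set(dev) | set(test)
    corpus = list(train)
    corpus.extend(sentence for sentence in other if sentence not in held_out)
    return corpus
```

Writing the result to a text file with one sentence per line gives input suitable for KenLM, for example `lmplz -o 5 < corpus.txt > lm.arpa` for a 5-gram model. The exact command and options the author used are not stated here.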

Source code is available here.

Run the model in a browser

This page contains an interactive demo widget that lets you test the model directly in a browser.

However, the widget runs the acoustic model alone, without the language model that significantly improves overall performance.
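To illustrate what decoding without a language model involves, here is a sketch of greedy CTC decoding. It assumes the acoustic model is a CTC model, as is typical for fine-tuned wav2vec2 checkpoints, and that the most likely token id per audio frame has already been computed. The function name, the toy vocabulary, and the blank id are all hypothetical. Each frame is decided independently, so nothing checks whether the resulting character sequence forms plausible words; that is the gap a language model fills.

```python
from typing import Dict, Sequence


def greedy_ctc_decode(
    frame_ids: Sequence[int],
    id_to_char: Dict[int, str],
    blank_id: int = 0,
) -> str:
    """Collapse repeated frame predictions, then drop CTC blank tokens."""
    chars = []
    previous = None
    for token_id in frame_ids:
        # Emit a character only when the prediction changes and is not blank.
        if token_id != previous and token_id != blank_id:
            chars.append(id_to_char[token_id])
        previous = token_id
    return "".join(chars)


# Toy vocabulary for illustration only (id 0 is the blank token).
TOY_VOCAB = {0: "<pad>", 1: "т", 2: "а", 3: "к"}
print(greedy_ctc_decode([1, 1, 0, 2, 2, 0, 0, 3], TOY_VOCAB))
```

A blank between two identical ids keeps both characters, which is how CTC represents doubled letters.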

You can try the full pipeline (acoustic model + language model) on the following Spaces page, which also works in a browser.

