Yehor Smoliakov committed on
Commit 1d558db · 2 Parent(s): b8b4f77 b050338

Merge branch 'main' of https://huggingface.co/Yehor/kenlm-ukrainian into main

Files changed (1)
  1. README.md +20 -0
README.md CHANGED
@@ -4,5 +4,25 @@ license: cc-by-nc-sa-4.0
 
 This repository contains KenLM models for the Ukrainian language
 
+Metrics for the NEWS models (tested with an acoustic model of [wav2vec2-xls-r-300m model](https://huggingface.co/Yehor/wav2vec2-xls-r-300m-uk-with-small-lm)):
+
+| Model | CER | WER |
+|-|-|-|
+| no LM | 0.0412 | 0.2206 |
+| lm-3gram-50k | 0.0348 | 0.1826 |
+| lm-4gram-50k | 0.0347 | 0.1818 |
+| lm-5gram-50k | 0.0347 | 0.1821 |
+| lm-3gram-100k | 0.031 | 0.1588 |
+| lm-4gram-100k | 0.0308 | 0.1579 |
+| lm-5gram-100k | 0.0308 | 0.1579 |
+| lm-3gram-300k | 0.0261 | 0.1294 |
+| lm-4gram-300k | 0.0261 | 0.1293 |
+| lm-5gram-300k | 0.0261 | 0.1293 |
+| lm-3gram-500k | 0.0248 | 0.1209 |
+| lm-4gram-500k | 0.0247 | 0.1207 |
+| lm-5gram-500k | 0.0247 | 0.1209 |
+
+Files of the models are under the Files and versions section.
+
 Attribution to the NEWS models:
 - Chaplynskyi, D. et al. (2021) lang-uk Ukrainian Ubercorpus [Data set]. https://lang.org.ua/uk/corpora/#anchor4
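
One way to read the metrics table added in this commit is as relative WER reduction against the "no LM" row. The sketch below is only an illustration: the WER values are copied from the 4-gram rows of the table, while `relative_reduction`, `BASELINE_WER`, and `WER_BY_MODEL` are hypothetical names introduced here and are not part of the repository.

```python
# WER values copied from the metrics table in this commit (4-gram rows only).
BASELINE_WER = 0.2206  # the "no LM" row

WER_BY_MODEL = {
    "lm-4gram-50k": 0.1818,
    "lm-4gram-100k": 0.1579,
    "lm-4gram-300k": 0.1293,
    "lm-4gram-500k": 0.1207,
}


def relative_reduction(baseline: float, value: float) -> float:
    """Return the fraction by which `value` is lower than `baseline`.

    Hypothetical helper for illustration; not taken from the repository.
    """
    return (baseline - value) / baseline


if __name__ == "__main__":
    # Print each model's WER improvement relative to decoding without an LM.
    for name, wer in WER_BY_MODEL.items():
        reduction = relative_reduction(BASELINE_WER, wer)
        print(f"{name}: {reduction:.1%} lower WER than no LM")
```

Under these numbers the lm-4gram-500k row comes out at roughly 45% lower WER than the baseline, and the 3-, 4-, and 5-gram rows at a given vocabulary size differ far less from each other than the rows at different vocabulary sizes do.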