atasoglu committed on
Commit
823d77b
1 Parent(s): c6fa84c

update readme

  1. README.md +26 -0
README.md CHANGED
@@ -1,3 +1,29 @@
  ---
  license: mit
+ language:
+ - tr
+ multilinguality:
+ - monolingual
+ pretty_name: truecase-tr-wiki
+ tags:
+ - truecase
+ task_categories:
+ - feature-extraction
  ---
+
+ Pretrained [truecase](https://github.com/daltonfury42/truecase) model for restoring correct capitalization (truecasing) in Turkish text.
+
+ Trained on the [20230301 Turkish Wikipedia dump](https://dumps.wikimedia.org/trwiki/20230301/). Due to limited RAM, only 40% of the corpus (more than 260K unique tokens) was used for training.
+
+ **Example:**
+
+ ```console
+ >>> from truecase import TrueCaser
+ >>> tc = TrueCaser('turkish.dist')
+ >>> tc.get_true_case("önemli iki nato üyesi ülke abd ve türkiye")
+ 'Önemli iki NATO üyesi ülke ABD ve Türkiye'
+ >>> tc.get_true_case("ayşe, ahmet ve zeynep hep birlikte antalyaya tatile gitti")
+ 'Ayşe, Ahmet ve Zeynep hep birlikte Antalyaya tatile gitti'
+ >>> tc.get_true_case("kurtuluş savaşı atatürkün samsuna çıkışıyla başladı")
+ 'Kurtuluş Savaşı Atatürkün Samsuna çıkışıyla başladı'
+ ```
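
The README example passes text that is already lowercase to `get_true_case`. If the input arrives in upper or mixed case, a Turkish-aware lowercasing step may be useful first, because Python's built-in `str.lower()` maps `I` to `i` and `İ` to `i` followed by a combining dot, rather than following the Turkish dotted/dotless i rules. The following is a minimal sketch under that assumption; `turkish_lower` is a hypothetical helper introduced here for illustration and is not part of the truecase library or of this repository.

```python
def turkish_lower(text: str) -> str:
    """Lowercase text using Turkish rules for the dotted/dotless i pair.

    Hypothetical helper for illustration; not part of the truecase library.
    """
    # Handle the two Turkish-specific capitals explicitly, then let
    # str.lower() take care of every other character (Ş, Ğ, Ü, Ö, Ç, ...).
    return text.replace("İ", "i").replace("I", "ı").lower()


if __name__ == "__main__":
    for sample in ("İSTANBUL", "IĞDIR", "Türkiye"):
        print(sample, "->", turkish_lower(sample))
```

The normalized string can then be passed to `tc.get_true_case(...)` exactly as in the README example.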