---
license: cc-by-sa-4.0
datasets:
- cjvt/cc_gigafida
language:
- sl
tags:
- word case classification
---

# T5-slo-word-shape-corrector

This T5 model identifies and corrects Slovenian words that appear in an incorrect shape (inflected form) for their context.
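This card does not include a loading snippet, so the following is only a minimal sketch of how a T5 checkpoint is typically run with the `transformers` library. The repository id `cjvt/t5-slo-word-shape-corrector`, the helper name `correct_text`, the absence of a task prefix, and the generation length are all assumptions that this card does not confirm.

```python
# Assumed repository id (not stated in this card); adjust it to the actual checkpoint.
MODEL_ID = "cjvt/t5-slo-word-shape-corrector"


def correct_text(text: str, max_new_tokens: int = 128) -> str:
    """Run one text through the seq2seq model and return the decoded output.

    Hypothetical helper: the input is passed as-is, without a task prefix,
    which is an assumption about how the model was trained.
    """
    # Imported lazily so the module can be loaded without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Tokenize the input and generate the corrected sequence.
    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `correct_text("Model v besedilu popravljaj besede, ki imeti nepravilno obliko.")` would download the checkpoint on first use and return the model's prediction for that sentence.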

## Model Output Example

Imagine we have the following Slovenian text:

_Model v besedilu popravljaj besede, ki imeti nepravilno obliko._

The model might return the following text (note: the predictions are chosen for illustration and are not guaranteed to be reproducible):

_Model v besedilu popravlja besede, ki imajo nepravilno obliko._

In the input sentence, the words `popravljaj` (an imperative) and `imeti` (an infinitive) appear in a grammatical form that does not fit the context. The model corrects them to the present-tense forms `popravlja` and `imajo`.
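To see which words were changed, the input and output can be compared word by word. The sketch below uses Python's standard `difflib` for this; the helper name `changed_words` and the whitespace tokenization are illustrative assumptions, not part of the model.

```python
from difflib import SequenceMatcher


def changed_words(source: str, corrected: str) -> list[tuple[str, str]]:
    """Return (original, corrected) pairs for word spans that differ."""
    src, tgt = source.split(), corrected.split()
    matcher = SequenceMatcher(a=src, b=tgt, autojunk=False)
    pairs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Join multi-word spans so insertions and deletions are reported too.
            pairs.append((" ".join(src[i1:i2]), " ".join(tgt[j1:j2])))
    return pairs


source = "Model v besedilu popravljaj besede, ki imeti nepravilno obliko."
corrected = "Model v besedilu popravlja besede, ki imajo nepravilno obliko."
print(changed_words(source, corrected))
# → [('popravljaj', 'popravlja'), ('imeti', 'imajo')]
```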

## More details

Evaluated on generated test sets, the model achieves the following results for detection and correction of words with incorrect shapes, scored together:

- `Precision`: 0.911
- `Recall`: 0.811
- `F1`: 0.858
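
As a quick consistency check, F1 is the harmonic mean of precision and recall, so the third score can be recomputed from the first two. The function name `f1_score` below is illustrative only.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Scores reported above.
print(round(f1_score(0.911, 0.811), 3))
# → 0.858
```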