---
license: cc-by-sa-4.0
datasets:
- cjvt/cc_gigafida
language:
- sl
tags:
- word case classification
---
# T5-slo-word-shape-corrector

This T5 model identifies and corrects Slovenian words whose shape (grammatical form) is incorrect for the context in which they appear.
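
A minimal usage sketch with the Hugging Face `transformers` library is given here for orientation. It assumes the checkpoint loads as a standard T5 sequence-to-sequence model and takes raw text with no task prefix; neither is confirmed by this card. The helper `correct_text`, the `model_id` parameter, and the `max_new_tokens` value are hypothetical, and the repository id must be supplied by the caller.

```python
def correct_text(text: str, model_id: str, max_new_tokens: int = 128) -> str:
    """Run a seq2seq checkpoint on `text` and return the decoded output.

    Assumption: `model_id` is the Hub repository id (or a local path) of this
    model, and the model expects plain input text without a task prefix.
    """
    # Imported lazily so the function can be defined without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Tokenize the input sentence into PyTorch tensors.
    inputs = tokenizer(text, return_tensors="pt")
    # Generate the (possibly corrected) sentence with default decoding settings.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `correct_text("Model v besedilu popravljaj besede, ki imeti nepravilno obliko.", model_id=...)` with the actual repository id in place of `...` would download the weights and return the model's prediction, which may differ from the illustrative output shown in this card.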
## Model Output Example

Imagine we have the following Slovenian text:

_Model v besedilu popravljaj besede, ki imeti nepravilno obliko._

The model might return the following text (the prediction is chosen for illustration and is not guaranteed to be reproducible):

_Model v besedilu popravlja besede, ki imajo nepravilno obliko._

In English, the corrected sentence reads roughly: "The model corrects words in the text that have an incorrect form."

In the input sentence, the words `popravljaj` and `imeti` are written with a gender or grammatical mood that does not fit the context. The model corrects them to `popravlja` and `imajo`.
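
To see which words changed between an input and a prediction, the two sentences can be compared token by token. The sketch below uses Python's standard `difflib` module and simple whitespace tokenization; the helper name `changed_words` is hypothetical, and this comparison is not part of the model itself.

```python
import difflib


def changed_words(source: str, corrected: str) -> list[tuple[str, str]]:
    """Return (original, corrected) pairs for spans that differ between two sentences."""
    src_tokens = source.split()
    out_tokens = corrected.split()
    matcher = difflib.SequenceMatcher(a=src_tokens, b=out_tokens, autojunk=False)

    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # Keep every span that is not identical in both sentences.
        if tag != "equal":
            changes.append((" ".join(src_tokens[i1:i2]), " ".join(out_tokens[j1:j2])))
    return changes


source = "Model v besedilu popravljaj besede, ki imeti nepravilno obliko."
corrected = "Model v besedilu popravlja besede, ki imajo nepravilno obliko."

for before, after in changed_words(source, corrected):
    print(f"{before} -> {after}")
# popravljaj -> popravlja
# imeti -> imajo
```

Whitespace splitting keeps punctuation attached to the neighbouring word, which is sufficient for this example but may be too coarse for other texts.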
## More details

Evaluating the model on generated test sets gives the following results, which cover detection and correction of words with incorrect shapes combined:

- `Precision`: 0.911
- `Recall`: 0.811
- `F1`: 0.858
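
As a consistency check, F1 is the harmonic mean of precision and recall, so the third figure can be recomputed from the first two. The short sketch below does this; the function name `f1_score` is chosen here for illustration only.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Reported precision and recall from the list above.
f1 = f1_score(0.911, 0.811)
print(round(f1, 3))  # 0.858
```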