cjvt/t5-slo-word-spelling-corrector

	---
	license: cc-by-sa-4.0
	datasets:
	- cjvt/cc_gigafida
	- cjvt/solar3
	- cjvt/sloleks
	language:
	- sl
	tags:
	- word spelling correction
	---

	# T5-incorrect-word-spelling-corrector

	This T5 model is designed to identify and correct words with incorrect spelling in the Slovenian language.
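A minimal sketch of how the model could be loaded and queried with the Hugging Face `transformers` library. The repository id `cjvt/t5-slo-word-spelling-corrector` is taken from this repository's path; the helper name `correct_spelling`, the `max_new_tokens` value, and the assumption that the model accepts a raw sentence without any task prefix are illustrative and not confirmed by this card.

```python
MODEL_ID = "cjvt/t5-slo-word-spelling-corrector"


def correct_spelling(text, max_new_tokens=128):
    """Return the model's corrected version of `text` (hypothetical helper).

    Assumption: the model takes the raw sentence as input, with no task prefix.
    """
    # Imported lazily so the module can be loaded without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Tokenize the sentence and generate the corrected sequence.
    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    # Downloads the model weights on first use.
    print(correct_spelling("Model v besedlu popravi napaake v nepravilno črkovanih besedah."))
```

Loading the tokenizer and model inside the function keeps the sketch short; for repeated calls it would be more efficient to load them once and reuse them.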

	## Model Output Example

	Consider the following Slovenian text:

	_Model v besedlu popravi napaake v nepravilno črkovanih besedah._

	The model might return the following text (note: this prediction was chosen for illustration and is not guaranteed to be reproducible):

	_Model v besedilu popravi napake v nepravilno črkovanih besedah._

	We observe that in the input sentence, the words `besedlu` and `napaake` are incorrectly spelled, so the model corrects them to `besedilu` and `napake`.
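The corrected words can also be recovered programmatically by aligning the input and output sentences. The sketch below uses Python's standard `difflib` module on whitespace-separated tokens; the helper name `changed_words` is hypothetical, and whitespace tokenization is a simplifying assumption (punctuation stays attached to the neighbouring word).

```python
from difflib import SequenceMatcher


def changed_words(source, corrected):
    """Return (original, replacement) pairs for token spans that differ."""
    src = source.split()
    dst = corrected.split()
    matcher = SequenceMatcher(a=src, b=dst, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Join multi-token spans so insertions and deletions are also reported.
            changes.append((" ".join(src[i1:i2]), " ".join(dst[j1:j2])))
    return changes


source = "Model v besedlu popravi napaake v nepravilno črkovanih besedah."
corrected = "Model v besedilu popravi napake v nepravilno črkovanih besedah."
print(changed_words(source, corrected))
# [('besedlu', 'besedilu'), ('napaake', 'napake')]
```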

	## More details

	Evaluated on generated test sets, the model achieves the following results (detection and correction of misspelled words combined):

	- `Precision`: 0.986
	- `Recall`: 0.935
	- `F1`: 0.960

	Evaluated in combination with cjvt/SloBERTa-slo-word-spelling-annotator on test sets constructed from the Šolar Eval dataset, the model achieves the following results (detection and correction of misspelled words combined):

	- `Precision`: 0.823
	- `Recall`: 0.796
	- `F1`: 0.810
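As a quick sanity check, the F1 values reported above can be recomputed from precision and recall. The sketch assumes F1 is the usual harmonic mean of the two reported figures; the card does not state how the scores were averaged, so this is an assumption. The function name `f1_score` is illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) as listed in this model card.
reported = {
    "generated test sets": (0.986, 0.935, 0.960),
    "Šolar Eval test sets": (0.823, 0.796, 0.810),
}

for name, (p, r, f1) in reported.items():
    print(f"{name}: recomputed F1 = {f1_score(p, r):.3f}, reported F1 = {f1:.3f}")
```

The first value recomputes to 0.960. The second recomputes to about 0.809, which agrees with the reported 0.810 to within the rounding of the three-decimal precision and recall.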

	## Acknowledgement

	The authors acknowledge the financial support from the Slovenian Research and Innovation Agency - research core funding No. P6-0411: Language Resources and Technologies for Slovene and research project No. J7-3159: Empirical foundations for digitally-supported development of writing skills.

	## Authors

	Thanks to Martin Božič, Marko Robnik-Šikonja and Špela Arhar Holdt for developing these models.