Slovak T5 Base

Monolingual Slovak model, trained from scratch on web data.

This model must be fine-tuned for a specific task; it does not yet support any instructions or prefixes.
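Because the pretrained model knows no prefixes, the input format is whatever the fine-tuning data defines. The following is a minimal sketch of how question answering records might be flattened into text-to-text pairs before tokenization. The `QAExample` class, the `to_text_pair` helper, and the "otázka:" / "kontext:" markers are all hypothetical choices made for illustration, not part of this model or its training setup.

```python
from dataclasses import dataclass


@dataclass
class QAExample:
    """One extractive QA record (hypothetical structure for this sketch)."""
    question: str
    context: str
    answer: str


def to_text_pair(example: QAExample) -> tuple[str, str]:
    """Flatten one QA record into (source, target) strings.

    The "otázka:" / "kontext:" markers are an arbitrary assumption.
    The pretrained model has not seen any prefixes, so the chosen format
    is learned during fine-tuning and must be reused unchanged at
    inference time.
    """
    source = (
        f"otázka: {example.question.strip()} "
        f"kontext: {example.context.strip()}"
    )
    target = example.answer.strip()
    return source, target


# Toy data, invented for the example.
examples = [
    QAExample(
        question="Aké je hlavné mesto Slovenska?",
        context="Bratislava je hlavné mesto Slovenska.",
        answer="Bratislava",
    )
]

pairs = [to_text_pair(e) for e in examples]
```

The resulting source and target strings would then be tokenized with the model's own tokenizer and passed to a sequence-to-sequence training loop of your choice.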

After fine-tuning, it is suitable for tasks such as:

Training data

Trained on the Slovak subset of the mC4 dataset using NanoT5 with default settings.

The training corpus totals 14B tokens after deduplication.

It consists of the Slovak data from:

After fine-tuning for question answering on SK-QUAD, it gives:

The model is published as is. We made no specific attempt to clean up the data.

Free for scientific and commercial use under the terms of: cc-by-sa-4.0

Safetensors

Model size: 0.4B params

Tensor type: F32
