---
license: eupl-1.1
---
SeTABERTa is a new multilingual language model pretrained from scratch on several open-access text sources: EU legislation, research articles, EU public documents, and US patents.
About two-thirds of the training data is English; the remainder covers the EU24 languages.
The model was trained on the JRC Big Data Platform and can be fine-tuned for downstream tasks.
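
A minimal sketch of how the pretrained checkpoint might be queried before fine-tuning. It assumes the checkpoint exposes a masked-language-modeling head (suggested only by the `mlm` in the repository name) and loads through the Hugging Face `transformers` `fill-mask` pipeline. The full Hub repository id is not stated in this README, so it is passed in as an argument; `build_masked_text`, `top_predictions`, and the `[BLANK]` placeholder are hypothetical names introduced for illustration.

```python
import sys
from typing import List, Tuple


def build_masked_text(template: str, mask_token: str, slot: str = "[BLANK]") -> str:
    """Replace the single placeholder slot with the tokenizer's own mask token."""
    if template.count(slot) != 1:
        raise ValueError(f"expected exactly one {slot!r} in the template")
    return template.replace(slot, mask_token)


def top_predictions(model_id: str, template: str, k: int = 5) -> List[Tuple[str, float]]:
    """Return the k most likely fillers for the masked slot with their scores."""
    # Imported lazily so build_masked_text works without transformers installed.
    from transformers import pipeline

    # Assumption: the checkpoint is compatible with the fill-mask pipeline.
    fill = pipeline("fill-mask", model=model_id)
    # Read the mask token from the tokenizer instead of hard-coding it.
    text = build_masked_text(template, fill.tokenizer.mask_token)
    return [(r["token_str"], r["score"]) for r in fill(text, top_k=k)]


if __name__ == "__main__":
    # Usage: python demo.py <hub-repo-id-of-this-model>
    repo_id = sys.argv[1]
    sentence = "The regulation applies to all [BLANK] states."
    for token, score in top_predictions(repo_id, sentence):
        print(f"{token!r}: {score:.3f}")
```

Reading the mask token from the tokenizer avoids guessing whether the vocabulary uses `<mask>` or `[MASK]`. For fine-tuning on a specific task, the same repository id could be passed to the matching `AutoModelFor...` class, again assuming the checkpoint is compatible.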