metadata
license: cc-by-sa-4.0
datasets:
- procesaur/Vikipedija
- procesaur/Vikizvornik
- procesaur/ZNANJE
- jerteh/SrpELTeC
- procesaur/kisobran
language:
- sr
Word2Vec Sr
Trained on a Serbian-language corpus of 9.5 billion words. The files include two models: the CBOW and SkipGram variants.
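from gensim.models import Word2Vec

# Load the model saved locally as "TeslaSG"
model = Word2Vec.load("TeslaSG")

# Word pairs to score; each word is compared with "zavesa"
examples = [
    ("dim", "zavesa"),
    ("staklo", "zavesa"),
    ("ormar", "zavesa"),
    ("prozor", "zavesa"),
    ("draperija", "zavesa")
]

# Print the similarity score for each pair
for e in examples:
    print(model.wv.similarity(e[0], e[1]))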
0.5193785
0.5763144
0.59982747
0.6022524
0.7117646
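The scores above are cosine similarities between the two word vectors, which is what gensim's `similarity` computes. As a rough illustration of the measure itself, here is a minimal sketch in plain Python. The function name `cosine_similarity` and the toy vectors `toy_a` and `toy_b` are invented for this example and are not taken from the model.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy 3-dimensional vectors, made up for illustration only;
# they are NOT vectors from the TeslaSG model.
toy_a = [1.0, 0.0, 1.0]
toy_b = [1.0, 1.0, 0.0]

# The vectors share one of their two non-zero directions,
# so the score lands about halfway between 0 and 1.
print(cosine_similarity(toy_a, toy_b))
```

With the loaded model, the same function could be applied to `model.wv["dim"]` and `model.wv["zavesa"]`, assuming both words are in the vocabulary. The result should agree with `model.wv.similarity` up to floating-point rounding.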
This research was supported by the Science Fund of the Republic of Serbia, #7276, Text Embeddings - Serbian Language Applications - TESLA