---
license: cc-by-sa-4.0
datasets:
- procesaur/Vikipedija
- procesaur/Vikizvornik
- procesaur/ZNANJE
- jerteh/SrpELTeC
- procesaur/kisobran
language:
- sr
---

Word2Vec Sr

Обучаван над корпусом српског језика - 9.5 милијарди речи

Међу датотекама се налазе два модела (CBOW и SkipGram варијанте)

Trained on a Serbian-language corpus of 9.5 billion words.

The repository files include two models: a CBOW variant and a SkipGram variant.

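```python
from gensim.models import Word2Vec

# Load the "TeslaSG" model file (download it from the repository files first)
model = Word2Vec.load("TeslaSG")

# Word pairs to compare; each pair includes "zavesa" (curtain)
examples = [
    ("dim", "zavesa"),
    ("staklo", "zavesa"),
    ("ormar", "zavesa"),
    ("prozor", "zavesa"),
    ("draperija", "zavesa"),
]

# Print the similarity score of each pair
for first, second in examples:
    print(model.wv.similarity(first, second))
```

```
0.5193785
0.5763144
0.59982747
0.6022524
0.7117646
```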
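The scores above are cosine similarities between the two word vectors, which is what gensim's `similarity` method computes. As a minimal sketch of that calculation, the block below reimplements it in plain Python. The function name `cosine_similarity` and the three-dimensional `toy_vectors` are assumptions made up for illustration and are not taken from the model.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical 3-dimensional vectors, invented for illustration only.
# With the real model, the vectors would come from model.wv[word].
toy_vectors = {
    "zavesa": [1.0, 2.0, 2.0],
    "draperija": [2.0, 4.0, 4.0],  # points in the same direction as "zavesa"
    "dim": [2.0, -1.0, 0.0],       # orthogonal to "zavesa"
}

for word in ("draperija", "dim"):
    score = cosine_similarity(toy_vectors[word], toy_vectors["zavesa"])
    print(word, round(score, 4))
```

Vectors pointing in the same direction score 1.0 and orthogonal vectors score 0.0, so a higher score in the output above means the model places the two words closer together.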

Author

Mihailo Škorić

@procesaur

Computation

TESLA project

@te-sla

Истраживање је спроведено уз подршку Фонда за науку Републике Србије, #7276, Text Embeddings – Serbian Language Applications – TESLA

This research was supported by the Science Fund of the Republic of Serbia, #7276, Text Embeddings - Serbian Language Applications - TESLA