
---
license: cc-by-sa-4.0
datasets:
  - procesaur/Vikipedija
  - procesaur/Vikizvornik
  - procesaur/ZNANJE
  - jerteh/SrpELTeC
language:
  - sr
---

Word2Vec

Trained on a Serbian-language corpus of 9.5 billion words.

The files include two models: a CBOW variant and a SkipGram variant.

```python
from gensim.models import Word2Vec

# Load the saved model from the file "TeslaW2V"
model = Word2Vec.load("TeslaW2V")
```
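Once a model is loaded, a typical use is to look up words whose vectors lie close together. The sketch below illustrates the underlying idea, cosine similarity, in plain Python. The 3-dimensional vectors and the helper names `cosine_similarity` and `nearest` are invented for illustration; they are not taken from the model or from gensim.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(vectors, word, topn=3):
    """Rank every other word by cosine similarity to `word`."""
    target = vectors[word]
    scored = [
        (other, cosine_similarity(target, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    # Highest similarity first, truncated to the requested number of results
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:topn]


# Toy vectors, made up for this example only (real model vectors are much longer).
toy_vectors = {
    "grad": [0.9, 0.1, 0.0],
    "selo": [0.8, 0.3, 0.1],
    "reka": [0.1, 0.9, 0.2],
    "more": [0.0, 0.8, 0.4],
}

for other, score in nearest(toy_vectors, "grad"):
    print(f"{other}\t{score:.3f}")
```

With the gensim model itself, the comparable query would be `model.wv.most_similar("grad", topn=3)`, assuming the query word is in the model's vocabulary.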

Editor

Mihailo Škorić

@procesaur

This research was supported by the Science Fund of the Republic of Serbia, #7276, Text Embeddings - Serbian Language Applications - TESLA, and by the Innovation Fund of the Republic of Serbia through the GOVTECH program, project #53096, Digiteks.