Hungarian word vectors for HuSpaCy.

The vectors were trained on the Hungarian Webcorpus 2.0 with floret, using the following command:

floret cbow -dim 300 -mode floret -bucket 200000 -minn 4 -maxn 6 -minCount 100 -neg 10 -hashCount 2 -lr 0.1 -thread 30 -epoch 5

The vectors are published in both fastText and floret formats.

| Feature | Description |
| --- | --- |
| Name | hu_vectors_web_lg |
| Version | 1.0 |
| Vectors | 200000 keys (300 dimensions) |
| Sources | Hungarian Webcorpus 2.0 (Dávid Márk Nemeskey (SZTAKI-HLT)) |
| License | cc-by-sa-4.0 |
| Author | SzegedAI, MILAB |

Accuracy

| Type | Score |
| --- | --- |
| ACC | 10.10 |
| MRR | 0.1772 |
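The card does not say which benchmark or evaluation script produced these scores. As a generic illustration of the two metrics only, the sketch below computes top-1 accuracy and mean reciprocal rank (MRR) from the rank at which the correct answer appears for each query. It assumes that ACC is reported as a percentage and that a query whose answer is not retrieved contributes zero to MRR; the function names and the example ranks are made up for this sketch.

```python
def top1_accuracy(ranks):
    """Percentage of queries whose correct answer is ranked first."""
    hits = sum(1 for r in ranks if r == 1)
    return 100.0 * hits / len(ranks)


def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all queries; a missing answer (None) counts as 0."""
    total = sum(1.0 / r for r in ranks if r is not None)
    return total / len(ranks)


if __name__ == "__main__":
    # Hypothetical 1-based ranks of the correct answer for four queries;
    # None means the correct answer was not retrieved at all.
    example_ranks = [1, 2, 4, None]
    print(top1_accuracy(example_ranks))
    print(mean_reciprocal_rank(example_ranks))
```

MRR gives partial credit when the correct answer appears near the top of the ranking without being first, so it is always at least as large as the top-1 accuracy expressed as a fraction.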