poltextlab/emBERT

[README UNDER CONSTRUCTION]

emBERT is a Hungarian text classification model that assigns a sentence to one of eight classes: seven emotions and a neutral state. It uses the huBERT tokenizer and was fine-tuned from the huBERT base model on a proprietary dataset of sentences from Hungarian online news sites. Experts labeled the fine-tuning sentences manually in a double-blind procedure, and disagreements between annotators were resolved manually. The validation results after fine-tuning were:

emotion	precision	recall	f1-score
0 - Anger	0.70	0.74	0.72
1 - Disgust	0.72	0.73	0.73
2 - Fear	0.61	0.47	0.53
3 - Happiness	0.38	0.37	0.38
4 - Neutral	0.65	0.62	0.63
5 - Sad	0.74	0.72	0.73
6 - Successful	0.79	0.81	0.80
7 - Trustful	0.76	0.78	0.77
weighted avg	0.73	0.74	0.73
Accuracy reached 74%.
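
As a quick consistency check on the table, the F1 score is the harmonic mean of precision and recall, F1 = 2PR / (P + R). The sketch below recomputes it from the rounded precision and recall values listed above; because the inputs are already rounded to two decimals, a recomputed value can differ from the reported one by about 0.01. The names `f1` and `ROWS` are illustrative only.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# (precision, recall, reported f1-score), copied from the table above
ROWS = {
    "Anger":      (0.70, 0.74, 0.72),
    "Disgust":    (0.72, 0.73, 0.73),
    "Fear":       (0.61, 0.47, 0.53),
    "Happiness":  (0.38, 0.37, 0.38),
    "Neutral":    (0.65, 0.62, 0.63),
    "Sad":        (0.74, 0.72, 0.73),
    "Successful": (0.79, 0.81, 0.80),
    "Trustful":   (0.76, 0.78, 0.77),
}

for name, (p, r, reported) in ROWS.items():
    print(f"{name:<11} reported={reported:.2f} recomputed={f1(p, r):.4f}")
```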

The emotion categories are based on Plutchik (1980), with anticipation replaced by neutral.

To use the model, load the tokenizer from the huBERT repository and the classifier from poltextlab/emBERT:

from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("SZTAKI-HLT/hubert-base-cc")

model = AutoModelForSequenceClassification.from_pretrained("poltextlab/emBERT")
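
Once the tokenizer and model are loaded, classifying a sentence could look like the minimal sketch below. It assumes that the model's output indices follow the 0 to 7 label order of the validation table (worth checking against `model.config.id2label`), and the helper names `load_embert`, `softmax`, `top_label`, and `classify` are hypothetical, not part of the model's API. Since access to the repository requires agreeing to share contact information, loading it will presumably need an authenticated Hugging Face session.

```python
import math

# Label order taken from the validation table; that the model's output
# indices follow this order is an assumption.
LABELS = ["Anger", "Disgust", "Fear", "Happiness",
          "Neutral", "Sad", "Successful", "Trustful"]

def softmax(logits):
    """Turn raw logits into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_label(logits):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    idx = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[idx], probs[idx]

def load_embert():
    """Load the huBERT tokenizer and the emBERT classifier."""
    from transformers import AutoTokenizer, AutoModelForSequenceClassification
    tokenizer = AutoTokenizer.from_pretrained("SZTAKI-HLT/hubert-base-cc")
    model = AutoModelForSequenceClassification.from_pretrained("poltextlab/emBERT")
    model.eval()  # switch off dropout for inference
    return tokenizer, model

def classify(sentence, tokenizer, model):
    """Classify one sentence and return (label, probability)."""
    import torch  # imported lazily so the helpers above work without it
    inputs = tokenizer(sentence, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return top_label(logits)
```

A call would then be `tokenizer, model = load_embert()` followed by `classify("Nagyon örülök a mai eredménynek.", tokenizer, model)`, where the Hungarian sentence is only an illustrative input.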

The model was created by György Márk Kis, Orsolya Ring, and Miklós Sebők of the Center for Social Sciences.
