This repository is publicly accessible, but you have to accept the conditions to access its files and content.

Our models are intended for academic use only. If you are not affiliated with an academic institution, please provide a rationale for using our models.

[README UNDER CONSTRUCTION]

emBERT is a Hungarian text classification model that classifies text into seven emotions and a neutral state. It uses the huBERT tokenizer and was fine-tuned from a huBERT base model on a proprietary dataset of sentences from Hungarian online news sites. The sentences in the fine-tuning set were labeled manually by experts in a double-blind manner, and inconsistencies were resolved manually. The validation results after fine-tuning were:

| Emotion | Precision | Recall | F1-score |
|---|---|---|---|
| 0 - Anger | 0.70 | 0.74 | 0.72 |
| 1 - Disgust | 0.72 | 0.73 | 0.73 |
| 2 - Fear | 0.61 | 0.47 | 0.53 |
| 3 - Happiness | 0.38 | 0.37 | 0.38 |
| 4 - Neutral | 0.65 | 0.62 | 0.63 |
| 5 - Sad | 0.74 | 0.72 | 0.73 |
| 6 - Successful | 0.79 | 0.81 | 0.80 |
| 7 - Trustful | 0.76 | 0.78 | 0.77 |
| Weighted avg | 0.73 | 0.74 | 0.73 |

Accuracy reached 74%.
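
For readers who want to sanity-check the table, the F1-score is the harmonic mean of precision and recall. The sketch below recomputes it for two rows; the helper name `f1_score` is only illustrative. Because the published precision and recall are already rounded to two decimals, a recomputed value can differ from the reported one by 0.01 on some rows.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values taken from the validation table (rounded to two decimals there).
print(round(f1_score(0.61, 0.47), 2))  # Fear  -> 0.53
print(round(f1_score(0.70, 0.74), 2))  # Anger -> 0.72
```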

The emotion categories are based on Plutchik (1980), with anticipation replaced by a neutral class.

To load the model, pair the huBERT tokenizer with the emBERT checkpoint:

from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("SZTAKI-HLT/hubert-base-cc")

model = AutoModelForSequenceClassification.from_pretrained("poltextlab/emBERT")
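
Once the tokenizer and model are loaded, a prediction could be obtained roughly as follows. This is a minimal sketch, not the authors' reference pipeline, and it rests on several assumptions: the output indices follow the numbering in the validation table, the task is single-label (the highest-scoring class wins), and you have already accepted the access conditions and authenticated with the Hub. The names `ID2LABEL`, `softmax`, `top_emotion`, and `classify` are hypothetical helpers introduced here for illustration.

```python
import math

# Assumption: output index i corresponds to row i of the validation table.
ID2LABEL = {
    0: "Anger", 1: "Disgust", 2: "Fear", 3: "Happiness",
    4: "Neutral", 5: "Sad", 6: "Successful", 7: "Trustful",
}


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_emotion(logits):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


def classify(sentence):
    """Run one sentence through emBERT and return (label, probability)."""
    # Imported lazily so the helpers above work without torch/transformers.
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained("SZTAKI-HLT/hubert-base-cc")
    model = AutoModelForSequenceClassification.from_pretrained("poltextlab/emBERT")
    model.eval()

    inputs = tokenizer(sentence, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return top_emotion(logits)
```

Calling `classify("...")` with a Hungarian sentence returns a label and its probability. For many sentences, it would be more efficient to load the tokenizer and model once outside the function and pass batches to the tokenizer.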

The model was created by György Márk Kis, Orsolya Ring, and Miklós Sebők of the Center for Social Sciences.
