This is the cointegrated/rubert-tiny2 model fine-tuned for classification of emotions in Russian sentences. The task is multilabel classification, because one sentence can contain multiple emotions.
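Because the task is multilabel, each label gets its own independent probability instead of sharing one softmax distribution. The sketch below shows that decoding step in plain Python. The label order is assumed from the column order of the metrics table, and the logits, the 0.5 threshold, and the helper name `decode_multilabel` are hypothetical illustrations, not real model output or part of the model's API.

```python
import math

# Assumed label order, taken from the columns of the metrics table.
LABELS = ["no emotion", "joy", "sadness", "surprise", "fear", "anger"]


def sigmoid(x: float) -> float:
    """Map a single logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def decode_multilabel(logits, threshold=0.5):
    """Return every label whose independent probability reaches the threshold.

    Unlike argmax over a softmax, this can return zero, one, or several labels.
    """
    if len(logits) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} logits, got {len(logits)}")
    probabilities = [sigmoid(value) for value in logits]
    return [label for label, p in zip(LABELS, probabilities) if p >= threshold]


# Hypothetical logits for one sentence (not produced by the real model).
example_logits = [-3.0, 1.2, -2.5, 0.8, -4.0, -1.0]
print(decode_multilabel(example_logits))  # ['joy', 'surprise']
```

In practice the threshold may be tuned per label on a validation set; 0.5 is only a common default.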

The model was trained on the CEDR dataset described in the paper "Data-Driven Model for Emotion Detection in Russian Texts" by Sboev et al.

The model was trained with the Adam optimizer for 40 epochs, with a learning rate of 1e-5 and a batch size of 64, in this notebook.

The quality of the predicted probabilities on the test dataset is the following:

| label | no emotion | joy | sadness | surprise | fear | anger | mean | mean (emotions) |
|---|---|---|---|---|---|---|---|---|
| AUC | 0.9286 | 0.9512 | 0.9564 | 0.8908 | 0.8955 | 0.7511 | 0.8956 | 0.8890 |
| F1 micro | 0.8624 | 0.9389 | 0.9362 | 0.9469 | 0.9575 | 0.9261 | 0.9280 | 0.9411 |
| F1 macro | 0.8562 | 0.8962 | 0.9017 | 0.8366 | 0.8359 | 0.6820 | 0.8348 | 0.8305 |
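The two summary columns can be reproduced from the per-label scores: "mean" averages all six labels, while "mean (emotions)" appears to average the five emotion labels and leave out "no emotion". The following is a small arithmetic check, assuming that reading of the column names; the scores are copied from the table above and the helper name `summarize` is illustrative.

```python
# Per-label scores from the table, in the order:
# no emotion, joy, sadness, surprise, fear, anger
SCORES = {
    "AUC": [0.9286, 0.9512, 0.9564, 0.8908, 0.8955, 0.7511],
    "F1 micro": [0.8624, 0.9389, 0.9362, 0.9469, 0.9575, 0.9261],
    "F1 macro": [0.8562, 0.8962, 0.9017, 0.8366, 0.8359, 0.6820],
}


def summarize(values):
    """Return (mean over all labels, mean over emotion labels only)."""
    mean_all = sum(values) / len(values)
    emotions = values[1:]  # drop the leading "no emotion" score
    mean_emotions = sum(emotions) / len(emotions)
    return round(mean_all, 4), round(mean_emotions, 4)


for metric, values in SCORES.items():
    mean_all, mean_emotions = summarize(values)
    print(f"{metric}: mean={mean_all:.4f}, mean (emotions)={mean_emotions:.4f}")
```

The results match the last two columns of the table to four decimal places.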
