distilbert-base-multilingual-cased-finetuned

This model is a fine-tuned version of distilbert-base-multilingual-cased on the Emotone Arabic dataset, a collection of tweets each labeled with one of eight emotions: none, anger, joy, sadness, love, sympathy, surprise, and fear. After the final (tenth) training epoch, it achieves the following results on the evaluation set:

  • Loss: 1.3099
  • Accuracy: 0.6632
  • F1: 0.6647
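
For readers who want to reproduce figures of this kind, the sketch below computes accuracy and a support-weighted F1 in plain Python. The card does not say which F1 averaging was used, so weighted averaging is an assumption here, and the function names are illustrative only.

```python
from collections import Counter
from typing import Hashable, Sequence


def accuracy(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Fraction of predictions that match the reference labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_for_label(y_true, y_pred, label) -> float:
    """F1 of a single class, treating it as the positive class."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def weighted_f1(y_true, y_pred) -> float:
    """Per-class F1 averaged by class support (ASSUMED averaging scheme)."""
    support = Counter(y_true)
    total = len(y_true)
    return sum(f1_for_label(y_true, y_pred, lab) * n for lab, n in support.items()) / total
```

With weighted averaging, F1 can come out slightly above accuracy, which is consistent with the numbers reported here but does not confirm the scheme.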

Model description

This model performs emotion recognition on Arabic text. It assigns each input tweet to one of eight emotion categories.

Intended uses & limitations

This model is intended for emotion detection, and related sentiment analysis tasks, on Arabic tweets. Because it was fine-tuned on tweets, it may perform poorly on text from outside social media and on languages other than Arabic.
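
A minimal sketch of how the model might be called for inference, assuming the `transformers` library is installed and that the repository ID shown on the model page is correct. The helper names `load_classifier` and `classify` are hypothetical. Whether predictions come back as emotion names or as generic `LABEL_n` strings depends on the model's configuration, which this card does not state.

```python
from typing import Callable, Dict, List

# Repository ID as shown on the model page (assumed to be publicly available).
MODEL_ID = "0marr/distilbert-base-multilingual-cased-finetuned"


def load_classifier():
    """Build a text-classification pipeline; downloads the weights on first use."""
    from transformers import pipeline  # imported lazily because it is a heavy dependency

    return pipeline("text-classification", model=MODEL_ID)


def classify(texts: List[str], classifier: Callable) -> List[Dict]:
    """Run the classifier and pair each input text with its top prediction."""
    predictions = classifier(texts)  # one {"label": ..., "score": ...} dict per text
    return [
        {"text": text, "label": pred["label"], "score": float(pred["score"])}
        for text, pred in zip(texts, predictions)
    ]
```

Typical usage would be `classify(["أنا سعيد جدًا اليوم!"], load_classifier())`. Passing the classifier in as an argument keeps `classify` easy to exercise with a stub when the model is not available.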

Training and evaluation data

The model was fine-tuned on the Emotone Arabic dataset, which consists of tweets labeled with the following emotions:

  • none
  • anger
  • joy
  • sadness
  • love
  • sympathy
  • surprise
  • fear

Label Mapping

| Label name | Numeric label |
| --- | --- |
| none | 0 |
| anger | 1 |
| joy | 2 |
| sadness | 3 |
| love | 4 |
| sympathy | 5 |
| surprise | 6 |
| fear | 7 |
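
The mapping above can be written as a pair of dictionaries and used to turn raw model outputs into emotion names. The sketch below is illustrative: `decode` is a hypothetical helper, and it assumes the model emits one logit per label in the numeric order listed.

```python
import math
from typing import Dict, List, Sequence, Tuple

# Numeric label -> emotion name, copied from the label mapping.
ID2LABEL: Dict[int, str] = {
    0: "none", 1: "anger", 2: "joy", 3: "sadness",
    4: "love", 5: "sympathy", 6: "surprise", 7: "fear",
}
LABEL2ID: Dict[str, int] = {name: idx for idx, name in ID2LABEL.items()}


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a sequence of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits: Sequence[float]) -> Tuple[str, float]:
    """Return the most likely emotion name and its probability."""
    if len(logits) != len(ID2LABEL):
        raise ValueError(f"expected {len(ID2LABEL)} logits, got {len(logits)}")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]
```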

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10
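
To make the linear schedule concrete, the sketch below computes the learning rate at a given optimizer step. It assumes 252 steps per epoch (the step count after epoch 1 in the training results) and no warmup, since the card does not mention any; `linear_lr` is a hypothetical helper that mirrors a linear decay to zero with optional warmup.

```python
LEARNING_RATE = 2e-5
NUM_EPOCHS = 10
STEPS_PER_EPOCH = 252  # taken from the training results (step 252 at epoch 1.0)
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH


def linear_lr(step: int, total_steps: int = TOTAL_STEPS,
              base_lr: float = LEARNING_RATE, warmup_steps: int = 0) -> float:
    """Learning rate at `step` under linear warmup followed by linear decay to zero.

    warmup_steps=0 is an ASSUMPTION; the card lists no warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = total_steps - step
    return base_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


if __name__ == "__main__":
    # Print the learning rate at the end of each epoch.
    for epoch in range(NUM_EPOCHS + 1):
        print(epoch, linear_lr(epoch * STEPS_PER_EPOCH))
```

Under these assumptions the rate falls from 2e-05 at step 0 to half that value at step 1260 and reaches zero at step 2520.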

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 |
| --- | --- | --- | --- | --- | --- |
| 1.0026 | 1.0 | 252 | 1.0417 | 0.6408 | 0.6321 |
| 0.8422 | 2.0 | 504 | 1.0355 | 0.6508 | 0.6425 |
| 0.7114 | 3.0 | 756 | 1.0611 | 0.6364 | 0.6342 |
| 0.5709 | 4.0 | 1008 | 1.0672 | 0.6692 | 0.6665 |
| 0.4590 | 5.0 | 1260 | 1.1167 | 0.6731 | 0.6693 |
| 0.3694 | 6.0 | 1512 | 1.1709 | 0.6637 | 0.6672 |
| 0.2975 | 7.0 | 1764 | 1.2094 | 0.6716 | 0.6699 |
| 0.2402 | 8.0 | 2016 | 1.2777 | 0.6642 | 0.6633 |
| 0.2090 | 9.0 | 2268 | 1.2997 | 0.6692 | 0.6685 |
| 0.1792 | 10.0 | 2520 | 1.3099 | 0.6632 | 0.6647 |
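
The results above show training loss falling every epoch while validation loss is lowest at epoch 2 and rises afterwards, a pattern that usually indicates overfitting. The sketch below shows one way to pick an epoch by a chosen metric from these numbers. It is illustrative only: the headline metrics on this card match the epoch 10 row, and the card does not describe any checkpoint selection.

```python
from typing import List, NamedTuple


class EpochResult(NamedTuple):
    epoch: int
    step: int
    train_loss: float
    val_loss: float
    accuracy: float
    f1: float


# Values copied from the training results.
RESULTS: List[EpochResult] = [
    EpochResult(1, 252, 1.0026, 1.0417, 0.6408, 0.6321),
    EpochResult(2, 504, 0.8422, 1.0355, 0.6508, 0.6425),
    EpochResult(3, 756, 0.7114, 1.0611, 0.6364, 0.6342),
    EpochResult(4, 1008, 0.5709, 1.0672, 0.6692, 0.6665),
    EpochResult(5, 1260, 0.4590, 1.1167, 0.6731, 0.6693),
    EpochResult(6, 1512, 0.3694, 1.1709, 0.6637, 0.6672),
    EpochResult(7, 1764, 0.2975, 1.2094, 0.6716, 0.6699),
    EpochResult(8, 2016, 0.2402, 1.2777, 0.6642, 0.6633),
    EpochResult(9, 2268, 0.2090, 1.2997, 0.6692, 0.6685),
    EpochResult(10, 2520, 0.1792, 1.3099, 0.6632, 0.6647),
]

best_by_val_loss = min(RESULTS, key=lambda r: r.val_loss)
best_by_accuracy = max(RESULTS, key=lambda r: r.accuracy)
best_by_f1 = max(RESULTS, key=lambda r: r.f1)

if __name__ == "__main__":
    print("lowest validation loss: epoch", best_by_val_loss.epoch)
    print("highest accuracy: epoch", best_by_accuracy.epoch)
    print("highest F1: epoch", best_by_f1.epoch)
```

The three criteria select different epochs (2, 5, and 7 respectively), so the choice of selection metric matters for this run.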

Example Outputs

Here are some example inputs and their corresponding model predictions:

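| Input tweet | English translation | Predicted emotion | Numeric label |
| --- | --- | --- | --- |
| "أنا سعيد جدًا اليوم!" | "I am very happy today!" | joy | 2 |
| "هذا أمر محبط حقًا." | "This is really frustrating." | sadness | 3 |
| "لا أستطيع تحمل هذا بعد الآن." | "I can't take this anymore." | anger | 1 |
| "أحب كل من يدعمني." | "I love everyone who supports me." | love | 4 |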

Framework versions

  • Transformers 4.44.2
  • PyTorch 2.4.1+cu121
  • Datasets 3.0.0
  • Tokenizers 0.19.1

Model size: 135M params (Safetensors, tensor type F32)

Model repository: 0marr/distilbert-base-multilingual-cased-finetuned (fine-tuned from distilbert-base-multilingual-cased)