metadata

library_name: transformers
license: apache-2.0
base_model: distilbert/distilroberta-base
tags:
  - generated_from_trainer
  - sentiment_analysis
model-index:
  - name: augmented-go-emotions-plus-other-datasets-fine-tuned-distilroberta-v2
    results: []
datasets:
  - google-research-datasets/go_emotions
language:
  - en
metrics:
  - f1
  - precision
  - recall

augmented-go-emotions-plus-other-datasets-fine-tuned-distilroberta-v2

This model is a fine-tuned version of distilbert/distilroberta-base on the following datasets:

GoEmotions
sem_eval_2018_task_1 (English)
Emotion Detection from Text - Pashupati Gupta
Emotions dataset for NLP - praveengovi

The training data was also augmented using TextAttack. Building on the [first version](https://huggingface.co/paradoxmaske/augmented-go-emotions-plus-other-datasets-fine-tuned-distilroberta) of the model, V2 applies additional data augmentation (EasyDataAugmenter) to all labels except 'neutral'.
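The augmentation rule described here (augment every example except those labelled 'neutral') can be sketched as follows. This is a minimal illustration, not the author's actual pipeline: the helper names `augment_non_neutral` and `make_eda_augmenter`, the (text, labels) row layout, the choice to skip any example that carries 'neutral', and the EasyDataAugmenter settings are all assumptions, since the card does not state them.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, List[str]]  # (text, emotion labels); hypothetical row layout


def augment_non_neutral(
    examples: List[Example],
    augment_fn: Callable[[str], List[str]],
    skip_label: str = "neutral",
) -> List[Example]:
    """Return the originals plus augmented copies of every non-neutral example."""
    out: List[Example] = []
    for text, labels in examples:
        out.append((text, labels))
        if skip_label in labels:
            continue  # assumption: any example carrying 'neutral' is left as-is
        for new_text in augment_fn(text):
            if new_text != text:  # drop augmentations that changed nothing
                out.append((new_text, list(labels)))
    return out


def make_eda_augmenter() -> Callable[[str], List[str]]:
    """Build a TextAttack EasyDataAugmenter (imported lazily; needs `textattack`)."""
    from textattack.augmentation import EasyDataAugmenter

    # These two values are illustrative, not taken from the model card.
    augmenter = EasyDataAugmenter(pct_words_to_swap=0.1, transformations_per_example=4)
    return augmenter.augment
```

With TextAttack installed, `augment_non_neutral(rows, make_eda_augmenter())` would return the original rows followed by their augmented variants. Passing the augmenter in as a callable keeps the label-filtering logic easy to test without the library.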

It achieves the following results on the evaluation set after the final (third) epoch:

Loss: 0.0792
Micro Precision: 0.6922
Micro Recall: 0.5854
Micro F1: 0.6343
Macro Precision: 0.5809
Macro Recall: 0.4729
Macro F1: 0.5136
Weighted Precision: 0.6764
Weighted Recall: 0.5854
Weighted F1: 0.6238
Hamming Loss: 0.0287
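
As a quick consistency check on these figures, F1 is the harmonic mean of precision and recall, F1 = 2PR / (P + R). The sketch below applies that formula to the reported values. It reproduces the micro F1, but not the macro F1, which is expected if the macro score is the unweighted mean of per-label F1 values (the usual convention, assumed here because the card does not say which implementation was used). The helper name `f1_from_pr` is illustrative.

```python
def f1_from_pr(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Reported evaluation-set values from the list above.
micro_f1 = f1_from_pr(0.6922, 0.5854)
macro_hm = f1_from_pr(0.5809, 0.4729)

print(f"micro F1 from P and R: {micro_f1:.4f} (reported 0.6343)")
print(f"harmonic mean of macro P and R: {macro_hm:.4f} (reported macro F1 0.5136)")
```

The same caveat applies to the weighted F1: it is an average of per-label F1 scores, so it need not equal the harmonic mean of weighted precision and weighted recall.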

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 3.0
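
To make the schedule concrete, the sketch below computes the learning rate a linear scheduler would produce at a given optimizer step. It assumes zero warmup steps (no warmup setting is listed in this card) and takes the total of 56574 steps from the training results table. `linear_lr` is an illustrative helper, not a library function.

```python
BASE_LR = 5e-05
TOTAL_STEPS = 56574   # 3 epochs x 18858 steps, per the training results table
WARMUP_STEPS = 0      # assumption: the card does not list a warmup setting


def linear_lr(step: int, base_lr: float = BASE_LR,
              total_steps: int = TOTAL_STEPS,
              warmup_steps: int = WARMUP_STEPS) -> float:
    """Linear warmup to base_lr, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start and at the end of each epoch.
for step in (0, 18858, 37716, 56574):
    print(step, linear_lr(step))
```

With a batch size of 8, 18858 steps per epoch corresponds to at most 150,864 training examples per epoch, assuming a single device and no gradient accumulation.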

Training results

Training Loss	Epoch	Step	Validation Loss	Micro Precision	Micro Recall	Micro F1	Macro Precision	Macro Recall	Macro F1	Weighted Precision	Weighted Recall	Weighted F1	Hamming Loss
No log	1.0	18858	0.0745	0.7528	0.5169	0.6129	0.6155	0.3805	0.4336	0.7386	0.5169	0.5827	0.0278
No log	2.0	37716	0.0757	0.7102	0.5616	0.6272	0.5937	0.4658	0.5049	0.6978	0.5616	0.6105	0.0284
No log	3.0	56574	0.0792	0.6922	0.5854	0.6343	0.5809	0.4729	0.5136	0.6764	0.5854	0.6238	0.0287
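
The table shows validation loss rising from epoch 1 to epoch 3 while micro and macro F1 keep improving, so which checkpoint counts as "best" depends on the selection metric. A small sketch of that comparison, using three columns copied from the table (the `best_epoch` helper is illustrative):

```python
# (epoch, validation loss, micro F1, macro F1) copied from the table above
history = [
    (1, 0.0745, 0.6129, 0.4336),
    (2, 0.0757, 0.6272, 0.5049),
    (3, 0.0792, 0.6343, 0.5136),
]


def best_epoch(rows, column: int, higher_is_better: bool) -> int:
    """Return the epoch whose value in `column` is best."""
    pick = max if higher_is_better else min
    return pick(rows, key=lambda row: row[column])[0]


print("lowest validation loss:", best_epoch(history, 1, higher_is_better=False))
print("highest micro F1:", best_epoch(history, 2, higher_is_better=True))
print("highest macro F1:", best_epoch(history, 3, higher_is_better=True))
```

The headline evaluation figures in this card match the epoch 3 row, which has the highest F1 scores and the highest validation loss.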

Test results

Label	Precision	Recall	F1-Score	Support
admiration	0.65	0.66	0.66	504
amusement	0.71	0.84	0.77	264
anger	0.80	0.70	0.74	1585
annoyance	0.44	0.25	0.32	320
approval	0.47	0.32	0.38	351
caring	0.37	0.31	0.34	135
confusion	0.41	0.42	0.42	153
curiosity	0.50	0.42	0.46	284
desire	0.47	0.35	0.40	83
disappointment	0.31	0.16	0.21	151
disapproval	0.42	0.29	0.35	267
disgust	0.72	0.63	0.67	1222
embarrassment	0.52	0.35	0.42	37
excitement	0.43	0.39	0.41	103
fear	0.79	0.76	0.78	787
gratitude	0.92	0.89	0.90	352
grief	0.00	0.00	0.00	6
joy	0.87	0.77	0.81	2298
love	0.69	0.61	0.65	1305
nervousness	0.43	0.26	0.32	23
optimism	0.72	0.57	0.64	1329
pride	0.62	0.31	0.42	16
realization	0.39	0.19	0.26	145
relief	0.26	0.24	0.25	160
remorse	0.56	0.75	0.64	56
sadness	0.75	0.69	0.72	2212
surprise	0.51	0.35	0.41	572
neutral	0.67	0.51	0.58	2668
Micro Avg	0.71	0.60	0.65	17388
Macro Avg	0.55	0.46	0.50	17388
Weighted Avg	0.70	0.60	0.64	17388
Samples Avg	0.64	0.61	0.61	17388
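
The average rows can be reproduced from the per-label rows. The sketch below recomputes the macro and weighted F1 from the F1-Score and Support columns as printed, so agreement is only expected to two decimals. It assumes the usual definitions: macro is the unweighted mean over labels, and weighted is the support-weighted mean.

```python
# (label, F1-score, support) copied from the test results table
rows = [
    ("admiration", 0.66, 504), ("amusement", 0.77, 264), ("anger", 0.74, 1585),
    ("annoyance", 0.32, 320), ("approval", 0.38, 351), ("caring", 0.34, 135),
    ("confusion", 0.42, 153), ("curiosity", 0.46, 284), ("desire", 0.40, 83),
    ("disappointment", 0.21, 151), ("disapproval", 0.35, 267),
    ("disgust", 0.67, 1222), ("embarrassment", 0.42, 37),
    ("excitement", 0.41, 103), ("fear", 0.78, 787), ("gratitude", 0.90, 352),
    ("grief", 0.00, 6), ("joy", 0.81, 2298), ("love", 0.65, 1305),
    ("nervousness", 0.32, 23), ("optimism", 0.64, 1329), ("pride", 0.42, 16),
    ("realization", 0.26, 145), ("relief", 0.25, 160), ("remorse", 0.64, 56),
    ("sadness", 0.72, 2212), ("surprise", 0.41, 572), ("neutral", 0.58, 2668),
]

total_support = sum(support for _, _, support in rows)

# Macro: every label counts equally, regardless of how many examples it has.
macro_f1 = sum(f1 for _, f1, _ in rows) / len(rows)

# Weighted: each label's F1 is weighted by its share of the total support.
weighted_f1 = sum(f1 * support for _, f1, support in rows) / total_support

print(f"labels={len(rows)} support={total_support}")
print(f"macro F1={macro_f1:.2f} weighted F1={weighted_f1:.2f}")
```

The gap between the two averages comes from the class imbalance: low-support labels such as grief (6 examples, F1 0.00) pull the macro average down, while high-support labels such as joy and sadness dominate the weighted average.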

Framework versions

Transformers 4.47.0
PyTorch 2.3.1+cu121
Datasets 2.20.0
Tokenizers 0.21.0