emotion-classification-model
This model is a fine-tuned version of distilbert-base-uncased on the dair-ai/emotion dataset. It is designed to classify text into the following emotional categories: sadness (0), joy (1), love (2), anger (3), fear (4), surprise (5).
It achieves the following results:
- Validation Accuracy: 94.25%
- Test Accuracy: 93.2%
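The integer ids above follow the dair-ai/emotion label scheme. A minimal mapping between ids and emotion names is sketched below; whether the released checkpoint's config already carries an equivalent id2label entry is an assumption worth checking.

```python
# Label ids used by the dair-ai/emotion dataset, as listed above.
ID2LABEL = {0: "sadness", 1: "joy", 2: "love", 3: "anger", 4: "fear", 5: "surprise"}
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}
```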
Model Description
This model uses the DistilBERT architecture, which is a lighter and faster variant of BERT. It has been fine-tuned specifically for emotion classification, making it suitable for tasks such as sentiment analysis, customer feedback analysis, and user emotion detection.
Key Features
- Efficient and lightweight for deployment.
- High accuracy on the dair-ai/emotion benchmark (94.25% validation, 93.2% test accuracy).
- Built on the general-purpose distilbert-base-uncased checkpoint and fine-tuned specifically for emotion classification.
Intended Uses & Limitations
Intended Uses
- Emotion analysis in text data.
- Sentiment detection in customer reviews, tweets, or user feedback.
- Psychological or behavioral studies to analyze emotional tone in communications.
Limitations
- May not generalize well to datasets with highly domain-specific language.
- Might struggle with sarcasm, irony, or other nuanced forms of language.
- The model is English-specific and may not perform well on non-English text.
Training and Evaluation Data
Training Dataset
- Dataset: dair-ai/emotion
- Training Set Size: 16,000 examples
- Dataset Description: The dataset contains English sentences labeled with six emotional categories: sadness (0), joy (1), love (2), anger (3), fear (4), surprise (5). A loading sketch follows this list.
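The dataset can be pulled directly from the Hugging Face Hub with the datasets library. The sketch below assumes the dataset's default configuration, which provides validation and test splits alongside the 16,000-example training split.

```python
from datasets import load_dataset

# Load the dair-ai/emotion dataset from the Hugging Face Hub (default configuration).
dataset = load_dataset("dair-ai/emotion")

print(dataset)                                   # DatasetDict with train/validation/test splits
print(dataset["train"][0])                       # {'text': ..., 'label': ...}
print(dataset["train"].features["label"].names)  # ['sadness', 'joy', 'love', 'anger', 'fear', 'surprise']
```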
Results
- Training Time: ~226 seconds
- Training Loss: 0.0520
- Validation Accuracy: 94.25%
- Test Accuracy: 93.2%
Training Procedure
Hyperparameters
- Learning Rate: 5e-05
- Batch Size: 16 (train and evaluation)
- Epochs: 3
- Seed: 42
- Optimizer: AdamW (betas=(0.9,0.999), epsilon=1e-08)
- Learning Rate Scheduler: Linear
- Mixed Precision Training: Native AMP (fp16). These settings are collected into a TrainingArguments sketch after this list.
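For reference, the hyperparameters above map onto transformers' TrainingArguments roughly as shown below. This is a hypothetical reconstruction, not the original training script; the output_dir name, the tokenization step, and the evaluation calls are assumptions.

```python
import numpy as np
import evaluate
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# Base checkpoint and data, as stated in this card.
model_name = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=6)

dataset = load_dataset("dair-ai/emotion")
tokenized = dataset.map(lambda batch: tokenizer(batch["text"], truncation=True), batched=True)

accuracy = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    return accuracy.compute(predictions=np.argmax(logits, axis=-1), references=labels)

# Hyperparameters from the list above; output_dir is an assumed name.
# To reproduce the per-epoch validation numbers, also set the evaluation
# strategy to "epoch" (the argument name varies across transformers
# versions: evaluation_strategy vs. eval_strategy).
args = TrainingArguments(
    output_dir="emotion-classification-model",
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=3,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    fp16=True,  # native AMP; requires a CUDA device, set to False on CPU
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    tokenizer=tokenizer,  # enables dynamic padding via the default collator
    compute_metrics=compute_metrics,
)

trainer.train()
print(trainer.evaluate(tokenized["validation"]))
print(trainer.evaluate(tokenized["test"]))
```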
Training and Validation Results
| Epoch | Training Loss | Validation Loss | Validation Accuracy |
|---|---|---|---|
| 1 | 0.5383 | 0.1845 | 92.90% |
| 2 | 0.2254 | 0.1589 | 93.55% |
| 3 | 0.0520 | 0.1485 | 94.25% |
Final Evaluation
- Validation Loss: 0.1485
- Validation Accuracy: 94.25%
- Test Loss: 0.1758
- Test Accuracy: 93.2% (see the re-evaluation sketch below)
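The test-set numbers can be checked against the released checkpoint with a short script. The sketch below is an illustration rather than the original evaluation code, and the label-string handling is an assumption about how the checkpoint's config names its classes.

```python
import evaluate
from datasets import load_dataset
from transformers import pipeline

# Re-score the released checkpoint on the dair-ai/emotion test split.
test_set = load_dataset("dair-ai/emotion", split="test")
classifier = pipeline("text-classification", model="Panda0116/emotion-classification-model")

names = test_set.features["label"].names  # ['sadness', 'joy', 'love', 'anger', 'fear', 'surprise']

def to_id(label: str) -> int:
    # Handle both named labels ("joy") and generic ones ("LABEL_1").
    return int(label.split("_")[-1]) if label.startswith("LABEL_") else names.index(label)

predictions = [to_id(out["label"]) for out in classifier(test_set["text"], truncation=True)]
accuracy = evaluate.load("accuracy")
print(accuracy.compute(predictions=predictions, references=test_set["label"]))
```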
Performance Metrics
- Training Speed: ~212 samples/second
- Evaluation Speed: ~1144 samples/second
Usage Example
```python
from transformers import pipeline

# Load the fine-tuned model
classifier = pipeline("text-classification", model="Panda0116/emotion-classification-model")

# Example usage
text = "I am so happy to see you!"
emotion = classifier(text)
print(emotion)
```
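Where the pipeline abstraction is too coarse, the tokenizer and model can also be used directly, as sketched below. Whether config.id2label returns the emotion names or generic LABEL_<id> strings depends on how the checkpoint was saved, so that part is an assumption; the mapping listed at the top of this card can be substituted.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Lower-level alternative to the pipeline call above.
model_id = "Panda0116/emotion-classification-model"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

inputs = tokenizer("I am so happy to see you!", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

probs = torch.softmax(logits, dim=-1)[0]
pred_id = int(probs.argmax())
# id2label may hold emotion names or "LABEL_<id>" strings, depending on the saved config.
print(model.config.id2label[pred_id], float(probs[pred_id]))
```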