emotion-classification-model
This model is a fine-tuned version of distilbert-base-uncased on the dair-ai/emotion dataset. It is designed to classify text into the following emotional categories: sadness (0), joy (1), love (2), anger (3), fear (4), surprise (5).
It achieves the following results:
- Validation Accuracy: 94.25%
- Test Accuracy: 93.2%
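The integer ids above follow the dair-ai/emotion label scheme. A minimal mapping between ids and emotion names is sketched below; whether the released checkpoint's config already carries an equivalent id2label entry is an assumption worth checking.

```python
# Label ids used by the dair-ai/emotion dataset, as listed above.
ID2LABEL = {0: "sadness", 1: "joy", 2: "love", 3: "anger", 4: "fear", 5: "surprise"}
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}
```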
Model Description
This model uses the DistilBERT architecture, which is a lighter and faster variant of BERT. It has been fine-tuned specifically for emotion classification, making it suitable for tasks such as sentiment analysis, customer feedback analysis, and user emotion detection.
Key Features
- Efficient and lightweight for deployment.
- High accuracy on the dair-ai/emotion benchmark (94.25% validation, 93.2% test accuracy).
- Built on the general-purpose distilbert-base-uncased checkpoint and fine-tuned specifically for emotion classification.
Intended Uses & Limitations
Intended Uses
- Emotion analysis in text data.
- Sentiment detection in customer reviews, tweets, or user feedback.
- Psychological or behavioral studies to analyze emotional tone in communications.
Limitations
- May not generalize well to datasets with highly domain-specific language.
- Might struggle with sarcasm, irony, or other nuanced forms of language.
- The model is English-specific and may not perform well on non-English text.
Training and Evaluation Data
Training Dataset
- Dataset: dair-ai/emotion
- Training Set Size: 16,000 examples
- Dataset Description: The dataset contains English sentences labeled with six emotional categories: sadness (0), joy (1), love (2), anger (3), fear (4), surprise (5). A loading sketch follows this list.
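The dataset can be pulled directly from the Hugging Face Hub with the datasets library. The sketch below assumes the dataset's default configuration, which provides validation and test splits alongside the 16,000-example training split.

```python
from datasets import load_dataset

# Load the dair-ai/emotion dataset from the Hugging Face Hub (default configuration).
dataset = load_dataset("dair-ai/emotion")

print(dataset)                                   # DatasetDict with train/validation/test splits
print(dataset["train"][0])                       # {'text': ..., 'label': ...}
print(dataset["train"].features["label"].names)  # ['sadness', 'joy', 'love', 'anger', 'fear', 'surprise']
```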
Results
- Training Time: ~226 seconds
- Training Loss: 0.0520
- Validation Accuracy: 94.25%
- Test Accuracy: 93.2%
Training Procedure
Hyperparameters
- Learning Rate: 5e-05
- Batch Size: 16 (train and evaluation)
- Epochs: 3
- Seed: 42
- Optimizer: AdamW (betas=(0.9,0.999), epsilon=1e-08)
- Learning Rate Scheduler: Linear
- Mixed Precision Training: Native AMP (fp16). These settings are collected into a TrainingArguments sketch after this list.
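For reference, the hyperparameters above map onto transformers' TrainingArguments roughly as shown below. This is a hypothetical reconstruction, not the original training script; the output_dir name, the tokenization step, and the evaluation calls are assumptions.

```python
import numpy as np
import evaluate
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# Base checkpoint and data, as stated in this card.
model_name = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=6)

dataset = load_dataset("dair-ai/emotion")
tokenized = dataset.map(lambda batch: tokenizer(batch["text"], truncation=True), batched=True)

accuracy = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    return accuracy.compute(predictions=np.argmax(logits, axis=-1), references=labels)

# Hyperparameters from the list above; output_dir is an assumed name.
# To reproduce the per-epoch validation numbers, also set the evaluation
# strategy to "epoch" (the argument name varies across transformers
# versions: evaluation_strategy vs. eval_strategy).
args = TrainingArguments(
    output_dir="emotion-classification-model",
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=3,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    fp16=True,  # native AMP; requires a CUDA device, set to False on CPU
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    tokenizer=tokenizer,  # enables dynamic padding via the default collator
    compute_metrics=compute_metrics,
)

trainer.train()
print(trainer.evaluate(tokenized["validation"]))
print(trainer.evaluate(tokenized["test"]))
```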
Training and Validation Results
| Epoch | Training Loss | Validation Loss | Validation Accuracy |
|---|---|---|---|
| 1 | 0.5383 | 0.1845 | 92.90% |
| 2 | 0.2254 | 0.1589 | 93.55% |
| 3 | 0.0520 | 0.1485 | 94.25% |
Final Evaluation
- Validation Loss: 0.1485
- Validation Accuracy: 94.25%
- Test Loss: 0.1758
- Test Accuracy: 93.2% (see the re-evaluation sketch below)
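The test-set numbers can be checked against the released checkpoint with a short script. The sketch below is an illustration rather than the original evaluation code, and the label-string handling is an assumption about how the checkpoint's config names its classes.

```python
import evaluate
from datasets import load_dataset
from transformers import pipeline

# Re-score the released checkpoint on the dair-ai/emotion test split.
test_set = load_dataset("dair-ai/emotion", split="test")
classifier = pipeline("text-classification", model="Panda0116/emotion-classification-model")

names = test_set.features["label"].names  # ['sadness', 'joy', 'love', 'anger', 'fear', 'surprise']

def to_id(label: str) -> int:
    # Handle both named labels ("joy") and generic ones ("LABEL_1").
    return int(label.split("_")[-1]) if label.startswith("LABEL_") else names.index(label)

predictions = [to_id(out["label"]) for out in classifier(test_set["text"], truncation=True)]
accuracy = evaluate.load("accuracy")
print(accuracy.compute(predictions=predictions, references=test_set["label"]))
```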
Performance Metrics
- Training Speed: ~212 samples/second
- Evaluation Speed: ~1144 samples/second
Usage Example
```python
from transformers import pipeline

# Load the fine-tuned model
classifier = pipeline("text-classification", model="Panda0116/emotion-classification-model")

# Example usage
text = "I am so happy to see you!"
emotion = classifier(text)
print(emotion)
```
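Where the pipeline abstraction is too coarse, the tokenizer and model can also be used directly, as sketched below. Whether config.id2label returns the emotion names or generic LABEL_<id> strings depends on how the checkpoint was saved, so that part is an assumption; the mapping listed at the top of this card can be substituted.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Lower-level alternative to the pipeline call above.
model_id = "Panda0116/emotion-classification-model"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

inputs = tokenizer("I am so happy to see you!", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

probs = torch.softmax(logits, dim=-1)[0]
pred_id = int(probs.argmax())
# id2label may hold emotion names or "LABEL_<id>" strings, depending on the saved config.
print(model.config.id2label[pred_id], float(probs[pred_id]))
```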