# emotion-classification-model
This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) on the [dair-ai/emotion dataset](https://huggingface.co/datasets/dair-ai/emotion). It is designed to classify text into the following emotional categories: sadness (0), joy (1), love (2), anger (3), fear (4), surprise (5).
It achieves the following results:
- **Validation Accuracy:** 94.25%
- **Test Accuracy:** 93.2%
## Model Description
This model uses the DistilBERT architecture, which is a lighter and faster variant of BERT. It has been fine-tuned specifically for emotion classification, making it suitable for tasks such as sentiment analysis, customer feedback analysis, and user emotion detection.
### Key Features
- Efficient and lightweight for deployment.
- 93.2% accuracy on the dair-ai/emotion test split.
- Builds on DistilBERT's general-purpose English pretraining and is fine-tuned on emotion-labeled text.
## Intended Uses & Limitations
### Intended Uses
- Emotion analysis in text data.
- Sentiment detection in customer reviews, tweets, or user feedback.
- Psychological or behavioral studies to analyze emotional tone in communications.
### Limitations
- May not generalize well to datasets with highly domain-specific language.
- Might struggle with sarcasm, irony, or other nuanced forms of language.
- The model is English-specific and may not perform well on non-English text.
## Training and Evaluation Data
### Training Dataset
- **Dataset:** [dair-ai/emotion](https://huggingface.co/datasets/dair-ai/emotion)
- **Training Set Size:** 16,000 examples
- **Dataset Description:** The dataset contains English sentences, each labeled with one of six emotional categories: sadness (0), joy (1), love (2), anger (3), fear (4), surprise (5).
### Results
- **Training Time:** ~226 seconds
- **Training Loss:** 0.0520
- **Validation Accuracy:** 94.25%
- **Test Accuracy:** 93.2%
## Training Procedure
### Hyperparameters
- **Learning Rate:** 5e-05
- **Batch Size:** 16 (train and evaluation)
- **Epochs:** 3
- **Seed:** 42
- **Optimizer:** AdamW (betas=(0.9,0.999), epsilon=1e-08)
- **Learning Rate Scheduler:** Linear
- **Mixed Precision Training:** Native AMP
### Training and Validation Results
| Epoch | Training Loss | Validation Loss | Validation Accuracy |
|-------|---------------|-----------------|---------------------|
| 1 | 0.5383 | 0.1845 | 92.90% |
| 2 | 0.2254 | 0.1589 | 93.55% |
| 3 | 0.0520 | 0.1485 | 94.25% |
### Final Evaluation
- **Validation Loss:** 0.1485
- **Validation Accuracy:** 94.25%
- **Test Loss:** 0.1758
- **Test Accuracy:** 93.2%
### Performance Metrics
- **Training Speed:** ~212 samples/second
- **Evaluation Speed:** ~1144 samples/second
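
As a quick consistency check, the reported training speed can be recovered from the other figures on this card (16,000 training examples, 3 epochs, ~226 seconds). The sketch below assumes every training example is seen once per epoch.

```python
# Figures reported on this card
train_examples = 16_000
epochs = 3
train_seconds = 226

# Assumption: each epoch processes the full training set once
samples_per_second = train_examples * epochs / train_seconds
print(f"{samples_per_second:.1f} samples/second")
```

The result rounds to the ~212 samples/second listed above.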
## Usage Example
```python
from transformers import pipeline
# Load the fine-tuned model
classifier = pipeline("text-classification", model="Panda0116/emotion-classification-model")
# Example usage
text = "I am so happy to see you!"
# The pipeline returns a list with one dict per input, each holding "label" and "score"
emotion = classifier(text)
print(emotion)
```
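
Depending on how the model's config is set up, the pipeline may return generic labels such as `LABEL_1` rather than emotion names. The sketch below maps such labels back to the categories listed in this card; `readable_label` is a hypothetical helper, and the `LABEL_<id>` format is an assumption about the output, not something this card confirms.

```python
# Label ids as documented in this card
ID2LABEL = {
    0: "sadness",
    1: "joy",
    2: "love",
    3: "anger",
    4: "fear",
    5: "surprise",
}


def readable_label(raw_label: str) -> str:
    """Map 'LABEL_<id>' to an emotion name; pass other labels through unchanged."""
    if raw_label.startswith("LABEL_"):
        return ID2LABEL[int(raw_label.split("_", 1)[1])]
    return raw_label


# Placeholder shaped like pipeline output, for illustration only (not a real prediction)
sample_output = [{"label": "LABEL_1", "score": 0.0}]
for prediction in sample_output:
    print(readable_label(prediction["label"]))
```

If the pipeline already returns names like `joy`, the helper leaves them as they are.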