# emotion-classification-model

This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) on the [dair-ai/emotion dataset](https://huggingface.co/datasets/dair-ai/emotion). It classifies English text into six emotion categories.

It achieves the following results on the dataset's validation and test splits:
- **Validation Accuracy:** 97.68%
- **Test Accuracy:** 94.25%

## Model Description

This model uses DistilBERT, a distilled variant of BERT that is smaller and faster at inference. It has been fine-tuned for emotion classification, which makes it suitable for tasks such as sentiment analysis, customer feedback analysis, and user emotion detection.

### Key Features
- Lightweight and efficient to deploy.
- 94.25% accuracy on the dair-ai/emotion test split.
- Builds on DistilBERT's general-purpose English pretraining, then fine-tuned on emotion-labeled text.

## Intended Uses & Limitations

### Intended Uses
- Emotion analysis in text data.
- Sentiment detection in customer reviews, tweets, or user feedback.
- Psychological or behavioral studies to analyze emotional tone in communications.

### Limitations
- May not generalize well to datasets with highly domain-specific language.
- Might struggle with sarcasm, irony, or other nuanced forms of language.
- The model is English-specific and may not perform well on non-English text.

## Training and Evaluation Data

### Training Dataset
- **Dataset:** [dair-ai/emotion](https://huggingface.co/datasets/dair-ai/emotion)
- **Training Set Size:** 16,000 examples
- **Dataset Description:** The dataset contains English sentences labeled with six emotion categories: sadness, joy, love, anger, fear, and surprise.
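
The label ids can be turned into readable names with a small lookup. This is a minimal sketch, not part of the released model: the id order follows the dair-ai/emotion dataset card, the `LABEL_<id>` form is the generic fallback that `transformers` uses when a checkpoint's config stores no label names (whether this checkpoint does is an assumption), and `ID2LABEL`, `LABEL2ID`, and `decode_label` are illustrative names.

```python
# Label ids in the order given on the dair-ai/emotion dataset card.
ID2LABEL = {0: "sadness", 1: "joy", 2: "love", 3: "anger", 4: "fear", 5: "surprise"}
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}


def decode_label(raw_label: str) -> str:
    """Return a readable emotion name for a label string.

    Assumption: the label is either already an emotion name or uses
    the generic "LABEL_<id>" form.
    """
    if raw_label.startswith("LABEL_"):
        label_id = int(raw_label.split("_", 1)[1])
        return ID2LABEL[label_id]
    return raw_label


print(decode_label("LABEL_1"))  # joy
print(decode_label("anger"))    # anger
```

If the pipeline already returns emotion names, `decode_label` passes them through unchanged.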

### Results
- **Training Time:** ~226 seconds
- **Training Loss:** 0.1987
- **Validation Accuracy:** 97.68%
- **Test Accuracy:** 94.25%

## Training Procedure

### Hyperparameters
- **Learning Rate:** 5e-05
- **Batch Size:** 16 (train and evaluation)
- **Epochs:** 3
- **Seed:** 42
- **Optimizer:** AdamW (betas=(0.9, 0.999), epsilon=1e-08)
- **Learning Rate Scheduler:** Linear
- **Mixed Precision Training:** Native AMP
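
To make the schedule concrete, the sketch below works out how many optimizer steps these settings imply and what a linear schedule does to the learning rate over them. It rests on assumptions the card does not state: a single device, no gradient accumulation, and zero warmup steps. `linear_lr` is an illustrative helper, not the training code used for this model.

```python
import math

TRAIN_EXAMPLES = 16_000  # training set size reported in this card
BATCH_SIZE = 16
EPOCHS = 3
BASE_LR = 5e-5
WARMUP_STEPS = 0         # assumption: the card does not mention warmup

steps_per_epoch = math.ceil(TRAIN_EXAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS


def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step: linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, total_steps - step)
    return BASE_LR * remaining / max(1, total_steps - WARMUP_STEPS)


print(total_steps)  # 3000
for step in (0, steps_per_epoch, 2 * steps_per_epoch, total_steps):
    print(step, linear_lr(step))
```

Under these assumptions the rate starts at 5e-05, has dropped by a third after each epoch, and reaches zero at step 3000.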

### Training and Validation Results

| Epoch | Training Loss | Validation Loss | Validation Accuracy |
|-------|---------------|-----------------|---------------------|
| 1     | 0.5383        | 0.1845          | 92.90%              |
| 2     | 0.2254        | 0.1589          | 93.55%              |
| 3     | 0.0739        | 0.0520          | 97.68%              |

### Test Results
- **Loss:** 0.1485
- **Accuracy:** 94.25%

### Performance Metrics
- **Training Speed:** ~212 samples/second
- **Evaluation Speed:** ~1149 samples/second
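
The training speed appears consistent with the other figures in this card: 16,000 examples over 3 epochs in about 226 seconds. The sketch below reproduces that arithmetic and estimates evaluation time from the reported evaluation speed. `eval_seconds` is an illustrative helper, and the 2,000-example input is only a sample value, not a figure from this card.

```python
TRAIN_EXAMPLES = 16_000   # training set size reported in this card
EPOCHS = 3
TRAIN_SECONDS = 226       # approximate training time reported in this card
EVAL_SAMPLES_PER_SECOND = 1149

# Every training example is seen once per epoch.
samples_seen = TRAIN_EXAMPLES * EPOCHS
train_throughput = samples_seen / TRAIN_SECONDS
print(round(train_throughput, 1))  # 212.4


def eval_seconds(num_examples: int) -> float:
    """Rough evaluation time at the reported evaluation speed."""
    return num_examples / EVAL_SAMPLES_PER_SECOND


# Sample value only: time to score a hypothetical 2,000-example set.
print(round(eval_seconds(2_000), 2))
```

Actual throughput depends on hardware, sequence length, and batch size, so treat these numbers as rough guides.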

## Usage Example

```python
from transformers import pipeline

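# Load the fine-tuned model from the Hugging Face Hub
classifier = pipeline("text-classification", model="Panda0116/emotion-classification-model")

# Classify a single sentence; the pipeline returns a list with one
# {"label": ..., "score": ...} dict per input text
text = "I am so happy to see you!"
result = classifier(text)
print(result)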
```