emotion-classification-model

This model is a fine-tuned version of distilbert-base-uncased on the dair-ai/emotion dataset. It is designed to classify text into the following emotional categories: sadness (0), joy (1), love (2), anger (3), fear (4), surprise (5).
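
For reference, the label mapping described above written out as a plain Python dictionary (taken from this description, not read from the checkpoint):

# Label ids used by the dair-ai/emotion dataset and this model
id2label = {0: "sadness", 1: "joy", 2: "love", 3: "anger", 4: "fear", 5: "surprise"}
label2id = {name: idx for idx, name in id2label.items()}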

It achieves the following results on the validation and test splits:

  • Validation Accuracy: 94.25%
  • Test Accuracy: 93.2%

Model Description

This model uses the DistilBERT architecture, which is a lighter and faster variant of BERT. It has been fine-tuned specifically for emotion classification, making it suitable for tasks such as sentiment analysis, customer feedback analysis, and user emotion detection.

Key Features

  • Efficient and lightweight for deployment.
  • High accuracy for emotion detection tasks.
  • Pretrained on diverse general-domain text and fine-tuned specifically for emotion classification.

Intended Uses & Limitations

Intended Uses

  • Emotion analysis in text data.
  • Sentiment detection in customer reviews, tweets, or user feedback.
  • Psychological or behavioral studies to analyze emotional tone in communications.

Limitations

  • May not generalize well to datasets with highly domain-specific language.
  • Might struggle with sarcasm, irony, or other nuanced forms of language.
  • The model is English-specific and may not perform well on non-English text.

Training and Evaluation Data

Training Dataset

  • Dataset: dair-ai/emotion
  • Training Set Size: 16,000 examples
  • Dataset Description: The dataset contains English sentences labeled with six emotional categories: sadness (0), joy (1), love (2), anger (3), fear (4), surprise (5)
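
For reference, a sketch of loading the dataset with the Hugging Face datasets library (the split layout is the standard one for dair-ai/emotion; only the 16,000-example training split size is stated in this card):

from datasets import load_dataset

# Download the emotion dataset (train / validation / test splits)
emotion = load_dataset("dair-ai/emotion")

print(emotion["train"].num_rows)   # 16000
print(emotion["train"][0])         # {'text': '...', 'label': <0-5>}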

Results

  • Training Time: ~226 seconds
  • Training Loss: 0.0520
  • Validation Accuracy: 94.25%
  • Test Accuracy: 93.2%

Training Procedure

Hyperparameters

  • Learning Rate: 5e-05
  • Batch Size: 16 (train and evaluation)
  • Epochs: 3
  • Seed: 42
  • Optimizer: AdamW (betas=(0.9,0.999), epsilon=1e-08)
  • Learning Rate Scheduler: Linear
  • Mixed Precision Training: Native AMP
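
The training script itself is not included in this card; a rough Trainer configuration matching the hyperparameters above might look like the following sketch (the output directory and per-epoch evaluation are assumptions, inferred from the per-epoch results below):

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="emotion-classification-model",  # assumed output path
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=3,
    seed=42,
    lr_scheduler_type="linear",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    fp16=True,                     # Native AMP mixed precision
    evaluation_strategy="epoch",   # assumed, since validation is reported per epoch
)

The Trainer uses AdamW by default, which matches the optimizer listed above.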

Training and Validation Results

Epoch   Training Loss   Validation Loss   Validation Accuracy
1       0.5383          0.1845            92.90%
2       0.2254          0.1589            93.55%
3       0.0520          0.1485            94.25%

Final Evaluation

  • Validation Loss: 0.1485
  • Validation Accuracy: 94.25%
  • Test Loss: 0.1758
  • Test Accuracy: 93.2%
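
As a sanity check, the reported test accuracy can be approximated with a short evaluation loop along these lines (a sketch, assuming the standard dair-ai/emotion test split and the repository name used in the usage example below):

import torch
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "Panda0116/emotion-classification-model"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

test = load_dataset("dair-ai/emotion", split="test")

correct = 0
for start in range(0, len(test), 32):
    batch = test[start : start + 32]
    inputs = tokenizer(batch["text"], padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    preds = logits.argmax(dim=-1)
    correct += (preds == torch.tensor(batch["label"])).sum().item()

print(f"Test accuracy: {correct / len(test):.4f}")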

Performance Metrics

  • Training Speed: ~212 samples/second
  • Evaluation Speed: ~1144 samples/second

Usage Example

from transformers import pipeline

# Load the fine-tuned model
classifier = pipeline("text-classification", model="Panda0116/emotion-classification-model")

# Example usage
text = "I am so happy to see you!"
emotion = classifier(text)
print(emotion)
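
Continuing from the snippet above, the pipeline can also return a score for every emotion instead of only the top prediction:

# Scores for all six labels
all_scores = classifier(text, top_k=None)
print(all_scores)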