# emotion-classification-model

This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) on the [dair-ai/emotion dataset](https://huggingface.co/datasets/dair-ai/emotion). It classifies English text into six emotion categories: sadness (0), joy (1), love (2), anger (3), fear (4), and surprise (5).

It achieves the following results:
- **Validation Accuracy:** 94.25%
- **Test Accuracy:** 93.2%

## Model Description

This model uses the DistilBERT architecture, which is a lighter and faster variant of BERT. It has been fine-tuned specifically for emotion classification, making it suitable for tasks such as sentiment analysis, customer feedback analysis, and user emotion detection.

### Key Features
- Efficient and lightweight for deployment.
- High accuracy for emotion detection tasks.
- Built on a DistilBERT checkpoint pretrained on general English text, then fine-tuned on the six emotion labels of the dair-ai/emotion dataset.

## Intended Uses & Limitations

### Intended Uses
- Emotion analysis in text data.
- Sentiment detection in customer reviews, tweets, or user feedback.
- Psychological or behavioral studies to analyze emotional tone in communications.

### Limitations
- May not generalize well to datasets with highly domain-specific language.
- Might struggle with sarcasm, irony, or other nuanced forms of language.
- The model is English-specific and may not perform well on non-English text.

## Training and Evaluation Data

### Training Dataset
- **Dataset:** [dair-ai/emotion](https://huggingface.co/datasets/dair-ai/emotion)
- **Training Set Size:** 16,000 examples
- **Dataset Description:** English sentences, each labeled with one of six emotion categories: sadness (0), joy (1), love (2), anger (3), fear (4), and surprise (5).

### Results
- **Training Time:** ~226 seconds
- **Training Loss:** 0.0520
- **Validation Accuracy:** 94.25%
- **Test Accuracy:** 93.2%

## Training Procedure

### Hyperparameters
- **Learning Rate:** 5e-05
- **Batch Size:** 16 (train and evaluation)
- **Epochs:** 3
- **Seed:** 42
- **Optimizer:** AdamW (betas=(0.9,0.999), epsilon=1e-08)
- **Learning Rate Scheduler:** Linear
- **Mixed Precision Training:** Native AMP
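
To make these settings more concrete, the sketch below works out how many optimizer steps they imply and how a linear schedule would decay the learning rate across them. It assumes no warmup steps and no gradient accumulation, neither of which is stated in this card, so treat the numbers as an illustration rather than a record of the actual run. The helper `linear_lr` is hypothetical and written only for this example.

```python
import math

TRAIN_EXAMPLES = 16_000  # training set size reported in this card
BATCH_SIZE = 16
EPOCHS = 3
BASE_LR = 5e-5

# Assumption: one optimizer step per batch (no gradient accumulation).
steps_per_epoch = math.ceil(TRAIN_EXAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS


def linear_lr(step: int, warmup_steps: int = 0) -> float:
    """Learning rate at an optimizer step under linear warmup then linear decay."""
    if step < warmup_steps:
        return BASE_LR * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return BASE_LR * remaining / max(1, total_steps - warmup_steps)


# Print the learning rate at the start of each epoch and at the final step.
for step in (0, steps_per_epoch, 2 * steps_per_epoch, total_steps):
    print(f"step {step:>4}: lr = {linear_lr(step):.2e}")
```

Under these assumptions the run has 1,000 steps per epoch and 3,000 steps in total, with the learning rate reaching zero at the last step.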

### Training and Validation Results

| Epoch | Training Loss | Validation Loss | Validation Accuracy |
|-------|---------------|-----------------|---------------------|
| 1     | 0.5383        | 0.1845          | 92.90%              |
| 2     | 0.2254        | 0.1589          | 93.55%              |
| 3     | 0.0520        | 0.1485          | 94.25%              |

### Final Evaluation
- **Validation Loss:** 0.1485
- **Validation Accuracy:** 94.25%
- **Test Loss:** 0.1758
- **Test Accuracy:** 93.2%

### Performance Metrics
- **Training Speed:** ~212 samples/second
- **Evaluation Speed:** ~1144 samples/second
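
As a quick sanity check, the reported training speed can be roughly reproduced from other figures in this card: 16,000 examples over 3 epochs in about 226 seconds. The snippet below is only that arithmetic. The `eval_seconds` helper and the 2,000-example count passed to it are illustrative assumptions, not values taken from this card.

```python
TRAIN_EXAMPLES = 16_000   # from the Training Dataset section
EPOCHS = 3                # from the Hyperparameters section
TRAIN_SECONDS = 226       # approximate training time from the Results section
EVAL_SPEED = 1144         # approximate evaluation samples/second

# Total samples seen during training divided by wall-clock time.
samples_processed = TRAIN_EXAMPLES * EPOCHS
train_throughput = samples_processed / TRAIN_SECONDS
print(f"training throughput: {train_throughput:.1f} samples/second")


def eval_seconds(num_examples: int) -> float:
    """Estimated evaluation time at the reported evaluation speed."""
    return num_examples / EVAL_SPEED


# Hypothetical evaluation set size, used only to show the helper.
print(f"estimated eval time: {eval_seconds(2_000):.2f} s")
```

The result, about 212 samples per second, agrees with the training speed listed above.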

## Usage Example

```python
from transformers import pipeline

# Load the fine-tuned model
classifier = pipeline("text-classification", model="Panda0116/emotion-classification-model")

# Example usage
text = "I am so happy to see you!"
emotion = classifier(text)
print(emotion)
```
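
The pipeline returns a list of dictionaries, each with a `label` and a `score`. Depending on how the model config was saved, the label may be an emotion name or a generic `LABEL_<id>` string; this card does not say which. The sketch below, using a hypothetical `readable_label` helper and a made-up sample output, maps the generic form back to the category names listed at the top of this card and leaves anything else unchanged.

```python
# Label ids as documented in this card.
ID2LABEL = {0: "sadness", 1: "joy", 2: "love", 3: "anger", 4: "fear", 5: "surprise"}


def readable_label(raw_label: str) -> str:
    """Map 'LABEL_<id>' to an emotion name; return other labels unchanged."""
    prefix = "LABEL_"
    suffix = raw_label[len(prefix):]
    if raw_label.startswith(prefix) and suffix.isdigit():
        return ID2LABEL.get(int(suffix), raw_label)
    return raw_label


# Hypothetical output in the shape the pipeline produces (not a real prediction).
sample_output = [{"label": "LABEL_1", "score": 0.99}]

for prediction in sample_output:
    print(readable_label(prediction["label"]), round(prediction["score"], 2))
```

If the model already emits names such as `joy`, the helper is a no-op and can be dropped.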