---
language: en
license: apache-2.0
---

# Women's Clothing Reviews Sentiment Analysis with DistilBERT

## Overview

This Hugging Face repository contains a fine-tuned DistilBERT model for sentiment analysis of women's clothing reviews. The model classifies reviews into positive, negative, or neutral sentiment categories, providing valuable insights into customer opinions.


## Model Details

- **Model Architecture**: Fine-tuned DistilBERT
- **Sentiment Categories**: Neutral [0], Negative [1], Positive [2]
- **Input Format**: Text-based clothing reviews
- **Output Format**: Sentiment category labels
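
The index-to-label scheme above can be kept in a small mapping so predictions are reported as readable labels. A minimal sketch (the dictionary below is written out from the categories listed above, not read from the model's own config):

```python
# Maps the model's output indices to sentiment labels, per the categories above.
ID2LABEL = {0: "neutral", 1: "negative", 2: "positive"}
LABEL2ID = {label: idx for idx, label in ID2LABEL.items()}

def label_for(class_id: int) -> str:
    """Return the human-readable sentiment label for a predicted class index."""
    return ID2LABEL[class_id]
```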

## Fine-tuning procedure

This model was fine-tuned on a relatively small dataset of 23,487 reviews, split into train/eval/test sets. Even so, the fine-tuned model performs slightly better than the base DistilBERT model on the test set.
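
The exact split ratios are not stated in this card; as an illustration, an 80/10/10 split of the 23,487 rows could be produced like this (the ratios and seed below are assumptions, not the author's values):

```python
from random import Random

def split_rows(rows, train_frac=0.8, eval_frac=0.1, seed=42):
    """Shuffle and split rows into train/eval/test subsets."""
    rows = list(rows)
    Random(seed).shuffle(rows)  # deterministic shuffle for reproducibility
    n_train = int(len(rows) * train_frac)
    n_eval = int(len(rows) * eval_frac)
    return (rows[:n_train],
            rows[n_train:n_train + n_eval],
            rows[n_train + n_eval:])

train, eval_set, test = split_rows(range(23487))
```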

## Training results

The model achieved the following results on the evaluation set:

- **Validation Loss**: 1.1677

### Comparison between the base DistilBERT model and the fine-tuned DistilBERT

| Model | Accuracy | Precision | Recall | F1 Score |
| --------------------- | -------- | --------- | ------ | -------- |
| DistilBERT base model | 0.79 | 0.77 | 0.79 | 0.77 |
| DistilBERT fine-tuned | 0.85 | 0.86 | 0.85 | 0.85 |
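
Accuracy and the per-class precision/recall/F1 behind scores like those above can be computed directly. A minimal pure-Python sketch of the underlying definitions (in practice, scikit-learn's `precision_recall_fscore_support` does the same with averaging options):

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def per_class_prf(y_true, y_pred, label):
    """Precision, recall, and F1 for one class (e.g. 2 = positive)."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```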

## Installation

To use this model, install the Hugging Face Transformers library and PyTorch:

- `pip install transformers`
- `pip install torch`

## Usage

You can load the pre-trained model for sentiment analysis with Hugging Face's `DistilBertForSequenceClassification` and `DistilBertTokenizerFast`:

```python
from transformers import DistilBertForSequenceClassification, DistilBertTokenizerFast
import torch

model_name = "ongaunjie/distilbert-cloths-sentiment"
tokenizer = DistilBertTokenizerFast.from_pretrained(model_name)
model = DistilBertForSequenceClassification.from_pretrained(model_name)

review = "This dress is amazing, I love it!"
inputs = tokenizer(review, return_tensors="pt")

# Run inference without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)

# Index of the highest-scoring class: 0 = neutral, 1 = negative, 2 = positive
predicted_class = int(torch.argmax(outputs.logits, dim=-1))
```
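
If you want class probabilities rather than just the argmax, the raw `outputs.logits` can be passed through a softmax. A minimal pure-Python sketch of that step (the example logits below are made up for illustration):

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for [neutral, negative, positive]
probs = softmax([-1.2, -0.8, 3.1])
predicted = max(range(len(probs)), key=probs.__getitem__)  # -> index 2, "positive"
```

In practice `torch.softmax(outputs.logits, dim=-1)` does the same thing on the tensor directly.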