
A distilled version of the RoBERTa model fine-tuned on the SST-2 part of the GLUE benchmark. It was obtained from the "teacher" RoBERTa model through task-specific knowledge distillation. Since the teacher was fine-tuned on SST-2, the distilled model is likewise ready to be used for sentiment analysis tasks.

Modifications to the original RoBERTa model (the student follows the distilroberta-base architecture):

- 6 transformer layers instead of 12
- roughly 82M parameters instead of 125M

The final distilled model achieves 92% accuracy on the SST-2 dataset. Given that the original RoBERTa reaches 94.8% accuracy on the same dataset with far more parameters (125M), and that the distilled model is nearly twice as fast, this is an impressive result.
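For reference, task-specific distillation of the kind described above typically trains the student on a mix of the hard SST-2 labels and the teacher's temperature-softened logits. The sketch below is illustrative only; the weighting alpha and the temperature are assumptions, not the exact recipe used for this checkpoint.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, alpha=0.5, temperature=2.0):
    # Hard-label term: ordinary cross-entropy against the SST-2 labels
    ce = F.cross_entropy(student_logits, labels)
    # Soft-label term: KL divergence between temperature-scaled
    # student and teacher distributions (scaled by T^2 as is conventional)
    kl = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)
    # alpha balances the hard-label and soft-label terms (assumed value)
    return alpha * ce + (1.0 - alpha) * kl
```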

Training Results after Hyperparameter Tuning

| Epoch | Training Loss | Validation Loss | Accuracy |
|------:|--------------:|----------------:|---------:|
| 1     | 0.144000      | 0.379220        | 0.907110 |
| 2     | 0.108500      | 0.466671        | 0.911697 |
| 3     | 0.078600      | 0.359551        | 0.915138 |
| 4     | 0.057400      | 0.358214        | 0.920872 |

Usage

To use the model with the 🤗 Transformers library:

# !pip install transformers

from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the tokenizer and the fine-tuned classification model from the Hub
tokenizer = AutoTokenizer.from_pretrained("azizbarank/distilroberta-base-sst2-distilled")
model = AutoModelForSequenceClassification.from_pretrained("azizbarank/distilroberta-base-sst2-distilled")
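
A minimal inference sketch using the objects loaded above. The example sentence is hypothetical, and the printed label name depends on the id2label mapping stored in the model config (SST-2 is a binary negative/positive task).

```python
import torch

# Hypothetical input sentence; any English text works
text = "This movie was absolutely wonderful!"
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Pick the highest-scoring class and map it to its label name if available
predicted_id = logits.argmax(dim=-1).item()
print(model.config.id2label.get(predicted_id, predicted_id))
```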