Distilled version of the [RoBERTa](https://huggingface.co/textattack/roberta-base-SST-2) model fine-tuned on the SST-2 part of the GLUE dataset. It was obtained from the "teacher" RoBERTa model through task-specific knowledge distillation. Since it was fine-tuned on SST-2, the final model can be used directly for sentiment analysis.
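
For readers unfamiliar with the technique, here is a minimal sketch of a typical task-specific distillation objective (not taken from this card's actual training code): the student is trained on a weighted mix of the usual cross-entropy against the gold labels and the KL divergence to the teacher's temperature-softened logits. The temperature `T` and mixing weight `alpha` below are illustrative hyperparameters.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft targets: KL divergence between temperature-scaled distributions,
    # scaled by T^2 so gradient magnitudes stay comparable across temperatures
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: standard cross-entropy against the gold SST-2 labels
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```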

## Modifications to the original RoBERTa model

The final distilled model achieves 91.6% accuracy on SST-2 with only 85M parameters. Given that the original RoBERTa reaches 92.5% accuracy on the same dataset with considerably more parameters (125M), the distilled model retains nearly all of the teacher's performance at roughly two thirds of its size.

### Tabular Comparison

| Metric                | Original RoBERTa | distilroberta-sst-2-distilled |
| --------------------- | ---------------- | ----------------------------- |
| Parameters            | 125M             | 85M                           |
| Accuracy on SST-2 (%) | 92.5             | 91.6                          |
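
The parameter counts above can be verified directly from the checkpoints; a quick sketch (assuming both models download from the Hub):

```python
from transformers import AutoModelForSequenceClassification

for name in ["textattack/roberta-base-SST-2",
             "azizbarank/distilroberta-base-sst2-distilled"]:
    model = AutoModelForSequenceClassification.from_pretrained(name)
    # Sum the number of elements across all weight tensors
    n_params = sum(p.numel() for p in model.parameters())
    print(f"{name}: {n_params / 1e6:.0f}M parameters")
```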

## Evaluation & Training Results

| Epoch | Training Loss | Validation Loss | Accuracy |
| ----- | ------------- | --------------- | -------- |
| 1     | 0.819500      | 0.547877        | 0.904817 |
| 2     | 0.308400      | 0.616938        | 0.900229 |
| 3     | 0.193600      | 0.496516        | 0.912844 |
| 4     | 0.136300      | 0.486479        | 0.917431 |
| 5     | 0.105100      | 0.449959        | 0.917431 |
| 6     | 0.081800      | 0.452210        | 0.916284 |
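
The accuracy column appears to correspond to the SST-2 validation split. A sketch of how such a figure could be reproduced (batching is omitted for brevity, so this runs slowly):

```python
import torch
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the SST-2 validation split and the distilled model
dataset = load_dataset("glue", "sst2", split="validation")
tokenizer = AutoTokenizer.from_pretrained("azizbarank/distilroberta-base-sst2-distilled")
model = AutoModelForSequenceClassification.from_pretrained("azizbarank/distilroberta-base-sst2-distilled")
model.eval()

correct = 0
for example in dataset:
    inputs = tokenizer(example["sentence"], return_tensors="pt", truncation=True)
    with torch.no_grad():
        pred = model(**inputs).logits.argmax(dim=-1).item()
    correct += int(pred == example["label"])

print(f"Validation accuracy: {correct / len(dataset):.4f}")
```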

## Usage

To use the model with the 🤗 Transformers library:

```python
# !pip install transformers

from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the distilled classifier and its tokenizer from the Hub
tokenizer = AutoTokenizer.from_pretrained("azizbarank/distilroberta-base-sst2-distilled")
model = AutoModelForSequenceClassification.from_pretrained("azizbarank/distilroberta-base-sst2-distilled")
```
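
A short inference example (illustrative, not part of the original card); the printed label assumes the checkpoint's config carries the usual SST-2 mapping of 0 = negative and 1 = positive:

```python
import torch

# Tokenize an example sentence and run it through the classifier
inputs = tokenizer("I really enjoyed this movie!", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Map the highest-scoring logit to its human-readable label
predicted_id = logits.argmax(dim=-1).item()
print(model.config.id2label[predicted_id])
```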