azizbarank committed
Commit · dda0cb4
Parent(s): 98828ff
Update README.md
README.md CHANGED
@@ -2,18 +2,16 @@ Distilled version of the [RoBERTa](https://huggingface.co/textattack/roberta-bas

## Modifications to the original RoBERTa model:

-The final distilled model was able to achieve
+The final distilled model achieves 92% accuracy on the SST-2 dataset. Given that the original RoBERTa reaches 94.8% accuracy on the same dataset with far more parameters (125M), and that the distilled model is nearly twice as fast, this is an impressive result.

-##
+## Training Results after Hyperparameter Tuning
| Epoch | Training Loss | Validation Loss | Accuracy |
| ----------------- | ------------ | --------- | ---------- |
-|1 | 0.
-|2 | 0.
-|3 | 0.
-|4 | 0.
-|5 | 0.105100 | 0.449959 | 0.917431 |
-|6 | 0.081800 | 0.452210 | 0.916284 |
+|1 | 0.144000 | 0.379220 | 0.907110 |
+|2 | 0.108500 | 0.466671 | 0.911697 |
+|3 | 0.078600 | 0.359551 | 0.915138 |
+|4 | 0.057400 | 0.358214 | 0.920872 |

## Usage
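The Usage section itself is cut off in this view. As a minimal sketch of how a distilled SST-2 classifier like this one would typically be loaded with the transformers pipeline (the model id below is a placeholder, not confirmed by this diff):

```python
# Minimal sketch: running a distilled SST-2 classifier via the transformers pipeline.
# The model id is hypothetical; substitute the actual repository name of this model.
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="azizbarank/distilroberta-base-sst2-distilled",  # hypothetical id
)

# Output is a list like [{'label': ..., 'score': ...}]
print(classifier("This movie was absolutely wonderful!"))
```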