metadata
language: en
Sparse BERT base model (uncased)
A pruned version of the pretrained BERT base model, with 1:2 structured sparsity.
Intended Use
The model can be fine-tuned on downstream tasks with the sparsity already embedded in the model. To preserve the sparsity during fine-tuning, add a mask to each sparse weight that blocks the optimizer from updating the zeros.
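One possible way to keep the zeros in place is sketched below in PyTorch. It is a minimal illustration rather than the authors' training code: the helper names `build_sparsity_masks` and `apply_masks` are hypothetical, a small `nn.Linear` layer stands in for the sparse model, and the sketch assumes that the pruned tensors are exactly the 2-D weight matrices that contain zeros. The mask is re-applied after every optimizer step, so any update the optimizer makes to a pruned entry is undone.

```python
import torch
from torch import nn


def build_sparsity_masks(model: nn.Module) -> dict:
    """Record, for each pruned weight matrix, which entries are non-zero."""
    masks = {}
    for name, param in model.named_parameters():
        # Assumption: pruned tensors are the 2-D weights that contain zeros.
        if param.dim() == 2 and bool((param == 0).any()):
            masks[name] = (param != 0).to(param.dtype)
    return masks


def apply_masks(model: nn.Module, masks: dict) -> None:
    """Reset every pruned entry to zero, in place."""
    with torch.no_grad():
        for name, param in model.named_parameters():
            if name in masks:
                param.mul_(masks[name])


# Toy stand-in for the sparse model (hypothetical, for illustration only).
torch.manual_seed(0)
layer = nn.Linear(8, 4)
with torch.no_grad():
    layer.weight[:, ::2] = 0.0  # zero one weight in every consecutive pair

masks = build_sparsity_masks(layer)
optimizer = torch.optim.AdamW(layer.parameters(), lr=1e-2)

for _ in range(3):
    inputs = torch.randn(16, 8)
    loss = layer(inputs).pow(2).mean()  # dummy objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    apply_masks(layer, masks)  # restore the zeros after each update

zeros_preserved = bool((layer.weight[:, ::2] == 0).all())
```

In a real fine-tuning run the masks would be built once, right after loading the pretrained weights and before the first optimizer step, and `apply_masks` would be called after each `optimizer.step()`.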
Evaluation Results
We obtain the following results on each task's development set. All results are the mean over 5 models trained with different random seeds:
MNLI-m (Acc) | MNLI-mm (Acc) | QQP (Acc/F1) | QNLI (Acc) | SST-2 (Acc) | STS-B (Pearson/Spearman) | SQuADv1.1 (Acc/F1) |
---|---|---|---|---|---|---|
83.3 | 83.9 | 90.8/87.6 | 90.4 | 91.3 | 88.8/88.3 | 80.5/88.2 |