This model is the result of fine-tuning a Prune OFA 90% sparse pre-trained BERT-Large model, combined with knowledge distillation.
It yields the following results on the SQuADv1.1 development set:
{"exact_match": 83.56669820245979, "f1": 90.20829352733487}
For further details, see our paper, Prune Once for All: Sparse Pre-Trained Language Models, and our open-source implementation available here.