Sparse BERT base model (uncased)

A pretrained BERT base (uncased) model pruned to 1:2 structured sparsity.

Intended Use

The model can be fine-tuned on downstream tasks with the sparsity already embedded in the weights. To preserve the sparsity during fine-tuning, apply a mask to each sparse weight so that the optimizer cannot update the zeroed entries.
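The masking step can be sketched in PyTorch as follows. This is a minimal illustration, not the authors' training code: the helper names `build_sparsity_masks` and `apply_masks` are hypothetical, the masks are inferred by assuming pruned entries are stored as exact zeros, and a toy `nn.Linear` layer stands in for the real checkpoint so the snippet runs without a download.

```python
import torch
from torch import nn


def build_sparsity_masks(model: nn.Module) -> dict:
    """Record which entries of each weight matrix are currently non-zero."""
    masks = {}
    for name, param in model.named_parameters():
        # Assumption: only 2-D weight matrices carry pruned zeros;
        # 1-D parameters such as biases are left unmasked.
        if param.dim() == 2:
            masks[name] = (param.detach() != 0).to(param.dtype)
    return masks


def apply_masks(model: nn.Module, masks: dict) -> None:
    """Re-zero the pruned positions in place."""
    with torch.no_grad():
        for name, param in model.named_parameters():
            if name in masks:
                param.mul_(masks[name])


# Toy stand-in for the pruned checkpoint: a linear layer with one weight
# zeroed in every consecutive pair (an assumed illustration of a 1:2 pattern).
torch.manual_seed(0)
layer = nn.Linear(4, 2)
with torch.no_grad():
    layer.weight[:, ::2] = 0.0

masks = build_sparsity_masks(layer)
optimizer = torch.optim.AdamW(layer.parameters(), lr=0.1)

for _ in range(3):
    loss = layer(torch.randn(8, 4)).pow(2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    apply_masks(layer, masks)  # undo any update to the pruned weights

# Fraction of weight entries that are still exactly zero after training.
sparsity = (layer.weight == 0).float().mean().item()
```

Re-applying the mask after every optimizer step is one simple option; masking the gradients before the step is an alternative. For the real model, the same two helpers could be called on the loaded checkpoint before and during fine-tuning.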

Evaluation Results

We obtained the following results on each task's development set. All results are the mean over 5 models trained with different seeds:

Task	MNLI-m (Acc)	MNLI-mm (Acc)	QQP (Acc/F1)	QNLI (Acc)	SST-2 (Acc)	STS-B (Pears/Spear)	SQuADv1.1 (Acc/F1)
	83.3	83.9	90.8/87.6	90.4	91.3	88.8/88.3	80.5/88.2
