cross-encoder/ms-marco-TinyBERT-L-2

# Cross-Encoder for MS Marco

This model uses [BERT-Tiny](https://github.com/google-research/bert), a tiny BERT model with only 2 layers, 2 attention heads, and a hidden size of 128.

It was trained on the [MS Marco Passage Ranking](https://github.com/microsoft/MSMARCO-Passage-Ranking) task.

The model can be used for Information Retrieval: given a query, encode the query together with each candidate passage (e.g. retrieved with ElasticSearch), then sort the passages by score in decreasing order. See [SBERT.net Information Retrieval](https://github.com/UKPLab/sentence-transformers/tree/master/examples/applications/information-retrieval) for more details. The training code is available here: [SBERT.net Training MS Marco](https://github.com/UKPLab/sentence-transformers/tree/master/examples/training/ms_marco).
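The re-ranking step described above can be sketched as follows. This is a minimal illustration, not the code from the linked examples: `rerank` and `score_fn` are hypothetical names, and `score_fn` is assumed to be any callable that maps a list of (query, passage) pairs to one score per pair.

```python
from typing import Callable, List, Sequence, Tuple


def rerank(
    query: str,
    passages: Sequence[str],
    score_fn: Callable[[List[Tuple[str, str]]], Sequence[float]],
) -> List[Tuple[str, float]]:
    """Score every (query, passage) pair and sort passages by score, highest first."""
    # Pair the query with each candidate passage
    pairs = [(query, passage) for passage in passages]
    # One relevance score per pair (assumption: higher means more relevant)
    scores = score_fn(pairs)
    ranked = sorted(zip(passages, scores), key=lambda item: item[1], reverse=True)
    return [(passage, float(score)) for passage, score in ranked]
```

With the model from this card, `score_fn` could be the `predict` method of `CrossEncoder('cross-encoder/ms-marco-TinyBERT-L-2', max_length=512)`, and `passages` would be the candidates returned by a first-stage retriever such as ElasticSearch.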

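## Usage and Performance

Pre-trained models can be used like this:
```
from sentence_transformers import CrossEncoder

# Replace 'model_name' with a model from the table below, e.g. 'cross-encoder/ms-marco-TinyBERT-L-2'
model = CrossEncoder('model_name', max_length=512)
# Each input is a (query, passage) pair; predict returns one relevance score per pair
scores = model.predict([('Query', 'Paragraph1'), ('Query', 'Paragraph2'), ('Query', 'Paragraph3')])
```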

In the following table, we provide various pre-trained Cross-Encoders together with their performance on the [TREC Deep Learning 2019](https://microsoft.github.io/TREC-2019-Deep-Learning/) and the [MS Marco Passage Reranking](https://github.com/microsoft/MSMARCO-Passage-Ranking/) dataset.

| Model-Name | NDCG@10 (TREC DL 19) | MRR@10 (MS Marco Dev) | Docs / Sec (BertTokenizerFast) | Docs / Sec (Python Tokenizer) |
| ------------- |:------------- | ----- | --- | --- |
| cross-encoder/ms-marco-TinyBERT-L-2 | 67.43 | 30.15 | 9000 | 780 |
| cross-encoder/ms-marco-TinyBERT-L-4 | 68.09 | 34.50 | 2900 | 760 |
| cross-encoder/ms-marco-TinyBERT-L-6 | 69.57 | 36.13 | 680 | 660 |
| cross-encoder/ms-marco-electra-base | 71.99 | 36.41 | 340 | 340 |
| Other models | | | | |
| nboost/pt-tinybert-msmarco | 63.63 | 28.80 | 2900 | 760 |
| nboost/pt-bert-base-uncased-msmarco | 70.94 | 34.75 | 340 | 340 |
| nboost/pt-bert-large-msmarco | 73.36 | 36.48 | 100 | 100 |
| Capreolus/electra-base-msmarco | 71.23 | | 340 | 340 |
| amberoad/bert-multilingual-passage-reranking-msmarco | 68.40 | | 330 | 330 |
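For readers unfamiliar with the two ranking metrics in the table, the sketch below shows how MRR@10 and NDCG@10 are commonly computed for a single query. It is an illustration under stated assumptions, not the evaluation script behind the reported numbers: the function names are hypothetical, `relevance` is assumed to hold the relevance labels of the passages in ranked order, and the NDCG variant shown uses linear gain with a log2 discount (the exact variant used for the table is not stated here).

```python
import math
from typing import Sequence


def mrr_at_k(relevance: Sequence[int], k: int = 10) -> float:
    """Reciprocal rank of the first relevant passage within the top k, else 0."""
    for rank, rel in enumerate(relevance[:k], start=1):
        if rel > 0:
            return 1.0 / rank
    return 0.0


def dcg_at_k(relevance: Sequence[float], k: int = 10) -> float:
    """Discounted cumulative gain with linear gain and a log2 rank discount."""
    return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(relevance[:k], start=1))


def ndcg_at_k(relevance: Sequence[float], k: int = 10) -> float:
    """DCG of the ranking divided by the DCG of the ideal ordering of the same labels."""
    # Assumption: `relevance` contains labels for all judged passages of the query
    ideal = dcg_at_k(sorted(relevance, reverse=True), k)
    return dcg_at_k(relevance, k) / ideal if ideal > 0 else 0.0
```

A benchmark score is then typically the mean of these per-query values over all queries; the table values appear to be reported on a 0 to 100 scale.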

Note: Runtime was computed on a V100 GPU. A bottleneck for smaller models is the standard Python tokenizer from Hugging Face Transformers version 3. Replacing it with the Rust-based fast tokenizer significantly improves throughput: