Update README.md
outputs = model(**batch_dict)
embeddings = average_pool(outputs.last_hidden_state, batch_dict['attention_mask'])

# normalize embeddings
embeddings = F.normalize(embeddings, p=2, dim=1)
scores = (embeddings[:2] @ embeddings[2:].T) * 100
print(scores.tolist())
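Because the embeddings are L2-normalized, the matrix product above is exactly cosine similarity scaled by 100. A minimal sketch with random stand-in vectors (no model required; the shapes and seed are illustrative only):

```python
import torch
import torch.nn.functional as F

# Stand-ins for real model outputs: 2 query + 2 passage embeddings, dim 8.
torch.manual_seed(0)
embeddings = F.normalize(torch.randn(4, 8), p=2, dim=1)

# Dot products of unit vectors are cosine similarities.
scores = (embeddings[:2] @ embeddings[2:].T) * 100
cosine = F.cosine_similarity(
    embeddings[:2].unsqueeze(1), embeddings[2:].unsqueeze(0), dim=-1
) * 100

print(torch.allclose(scores, cosine, atol=1e-5))
```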
For all labeled datasets, we use only their training sets for fine-tuning.

For other training details, please refer to our paper at [https://arxiv.org/pdf/2212.03533.pdf](https://arxiv.org/pdf/2212.03533.pdf).

## Benchmark Results on [Mr. TyDi](https://arxiv.org/abs/2108.08787)
| Model | Avg MRR@10 | ar | bn | en | fi | id | ja | ko | ru | sw | te | th |
|-----------------------|------------|------|------|------|------|------|------|------|------|------|------|------|
| BM25 | 33.3 | 36.7 | 41.3 | 15.1 | 28.8 | 38.2 | 21.7 | 28.1 | 32.9 | 39.6 | 42.4 | 41.7 |
| mDPR | 16.7 | 26.0 | 25.8 | 16.2 | 11.3 | 14.6 | 18.1 | 21.9 | 18.5 | 7.3 | 10.6 | 13.5 |
| BM25 + mDPR | 41.7 | 49.1 | 53.5 | 28.4 | 36.5 | 45.5 | 35.5 | 36.2 | 42.7 | 40.5 | 42.0 | 49.2 |
| | | | | | | | | | | | | |
| multilingual-e5-small | 64.4 | 71.5 | 66.3 | 54.5 | 57.7 | 63.2 | 55.4 | 54.3 | 60.8 | 65.4 | 89.1 | 70.1 |
| multilingual-e5-base | 65.9 | 72.3 | 65.0 | 58.5 | 60.8 | 64.9 | 56.6 | 55.8 | 62.7 | 69.0 | 86.6 | 72.7 |
| multilingual-e5-large | **70.5** | 77.5 | 73.2 | 60.8 | 66.8 | 68.5 | 62.5 | 61.6 | 65.8 | 72.7 | 90.2 | 76.2 |
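MRR@10, the metric reported above, is the mean over queries of the reciprocal rank of the first relevant result within the top 10 (0 if none appears). A small illustrative implementation, not the official Mr. TyDi evaluation script:

```python
def mrr_at_10(rankings):
    """rankings: per query, a list of booleans marking relevant results in ranked order."""
    total = 0.0
    for ranking in rankings:
        for rank, is_relevant in enumerate(ranking[:10], start=1):
            if is_relevant:
                total += 1.0 / rank
                break
    return total / len(rankings)

queries = [
    [False, True, False],  # first relevant result at rank 2 -> 1/2
    [True, False],         # relevant at rank 1 -> 1.0
    [False] * 12,          # nothing relevant in the top 10 -> 0.0
]
print(mrr_at_10(queries))  # 0.5
```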
## MTEB Benchmark Evaluation

Check out [unilm/e5](https://github.com/microsoft/unilm/tree/master/e5) to reproduce evaluation results
on the [BEIR](https://arxiv.org/abs/2104.08663) and [MTEB benchmark](https://arxiv.org/abs/2210.07316).
## Support for Sentence Transformers

Below is an example of usage with `sentence_transformers`.

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('intfloat/multilingual-e5-large')
input_texts = [
    'query: how much protein should a female eat',
    'query: 南瓜的家常做法',
    "passage: As a general guideline, the CDC's average requirement of protein for women ages 19 to 70 is 46 grams per day. But, as you can see from this chart, you'll need to increase that if you're expecting or training for a marathon. Check out the chart below to see how much protein you should be eating each day.",
    "passage: 1.清炒南瓜丝 原料:嫩南瓜半个 调料:葱、盐、白糖、鸡精 做法: 1、南瓜用刀薄薄的削去表面一层皮,用勺子刮去瓤 2、擦成细丝(没有擦菜板就用刀慢慢切成细丝) 3、锅烧热放油,入葱花煸出香味 4、入南瓜丝快速翻炒一分钟左右,放盐、一点白糖和鸡精调味出锅 2.香葱炒南瓜 原料:南瓜1只 调料:香葱、蒜末、橄榄油、盐 做法: 1、将南瓜去皮,切成片 2、油锅8成热后,将蒜末放入爆香 3、爆香后,将南瓜片放入,翻炒 4、在翻炒的同时,可以不时地往锅里加水,但不要太多 5、放入盐,炒匀 6、南瓜差不多软和绵了之后,就可以关火 7、撒入香葱,即可出锅"
]
embeddings = model.encode(input_texts, normalize_embeddings=True)
```
Package requirements:

`pip install sentence_transformers~=2.2.2`

Contributors: [michaelfeil](https://huggingface.co/michaelfeil)
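The `~=2.2.2` pin is a PEP 440 compatible-release specifier: it allows any 2.2.x release at or above 2.2.2, but not 2.3. This can be checked with the `packaging` library (assumed to be installed; it is a dependency of pip itself):

```python
from packaging.specifiers import SpecifierSet

spec = SpecifierSet("~=2.2.2")  # equivalent to >=2.2.2, ==2.2.*
print("2.2.2" in spec, "2.2.9" in spec, "2.3.0" in spec)
```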
## FAQ

**1. Do I need to add the prefix "query: " and "passage: " to input texts?**

Yes, this is how the model is trained; otherwise you will see a performance degradation.

Here are some rules of thumb:

- Use "query: " and "passage: " correspondingly for asymmetric tasks such as passage retrieval in open QA and ad-hoc information retrieval.
- Use the "query: " prefix for symmetric tasks such as semantic similarity, bitext mining, and paraphrase retrieval.
- Use the "query: " prefix if you want to use embeddings as features, e.g., for linear probing classification or clustering.
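A hypothetical helper (not part of the model's code) that applies these prefix rules before encoding:

```python
def add_prefix(texts, kind):
    """Prepend the E5 prefix; kind must be 'query' or 'passage'."""
    if kind not in ("query", "passage"):
        raise ValueError("kind must be 'query' or 'passage'")
    return [f"{kind}: {t}" for t in texts]

print(add_prefix(["how much protein should a female eat"], "query"))
# ['query: how much protein should a female eat']
```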
**2. Why are my reproduced results slightly different from those reported in the model card?**

Different versions of `transformers` and `pytorch` could cause negligible but non-zero performance differences.
## Citation

If you find our paper or models helpful, please consider citing as follows: