---
language:
- en
base_model:
- google/gemma-2-9b-it
---
# Gemma Embeddings v0.8
GemmaEmbed is a dense-vector embedding model trained specifically for retrieval. As of December 2, 2024, GemmaEmbed achieves the #1 position overall on the _MTEB Retrieval_ leaderboard, with a score of 63.80.
# Important Notes
* This is not an official Google product.
* This is a research project.
# Results summary
Results compared to BGE-EN-ICL on several large retrieval datasets:

| Model | DBPedia | FEVER | HotPotQA | MSMARCO | NQ |
| ----- | ------- | ----- | -------- | ------- | -- |
| BGE-EN-ICL | 51.63 | 92.83 | 85.14 | 46.79 | 73.88 |
| Gemma-Embeddings-v0.8 | 52.58 | 93.50 | 87.58 | 47.13 | 74.45 |
# Model & Data
Our base encoder model is [Gemma2 9B](https://huggingface.co/google/gemma-2-9b).
We use the [BGE-EN-ICL training data](https://huggingface.co/datasets/cfli/bge-full-data).
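# Usage
This card does not specify a loading recipe, so the following is a minimal sketch of a generic dense-retrieval flow via `sentence-transformers`. The repo id, pooling, and prompt format are assumptions, not confirmed details of this model.

```python
# Hedged usage sketch: repo id below is a placeholder for this model's
# Hugging Face path; the actual pooling/prompting may differ.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("google/Gemma-Embeddings-v0.8")  # hypothetical repo id

queries = ["What is dense retrieval?"]
docs = [
    "Dense retrieval encodes queries and documents as vectors and "
    "ranks documents by vector similarity.",
    "BM25 is a sparse lexical ranking function.",
]

# Normalizing embeddings makes the dot product equal cosine similarity.
q_emb = model.encode(queries, normalize_embeddings=True)
d_emb = model.encode(docs, normalize_embeddings=True)

scores = q_emb @ d_emb.T  # shape: (num_queries, num_docs)
print(scores)
```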
# Research Team
* Nicholas Monath
* Michael Boratko
* Seungyeon Kim
* Andrew McCallum
* Rob Fergus
* Manzil Zaheer