m-ric
posted an update Oct 18
๐—›๐—ผ๐˜„ ๐˜๐—ผ ๐—ฟ๐—ฒ-๐—ฟ๐—ฎ๐—ป๐—ธ ๐˜†๐—ผ๐˜‚๐—ฟ ๐˜€๐—ป๐—ถ๐—ฝ๐—ฝ๐—ฒ๐˜๐˜€ ๐—ถ๐—ป ๐—ฅ๐—”๐—š โ‡’ ColBERT, Rerankers, Cross-Encoders

Let's say you're doing RAG, and to improve performance you try to rerank a few candidate source snippets by their relevance to a query.

How can you score similarity between your query and any source document? 🤔 📄 ↔️ 📑

1. Just use embeddings: No-interaction 🏎️

This means that you encode each token from both the query and the doc as separate vectors, then average the token vectors of each side separately to get 2 vectors in total, and finally compute their similarity, for example with cosine similarity.
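
To make the no-interaction scoring above concrete, here is a minimal sketch. It assumes the token vectors already exist (random NumPy arrays stand in for real model outputs), and the helper names `mean_pool` and `cosine_similarity` are made up for illustration. Real embedding models may pool differently.

```python
import numpy as np

def mean_pool(token_vectors: np.ndarray) -> np.ndarray:
    """Average a (num_tokens, dim) array into a single (dim,) vector."""
    return token_vectors.mean(axis=0)

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two 1-D vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Assumption: random vectors stand in for the token embeddings a model would produce.
rng = np.random.default_rng(0)
query_tokens = rng.normal(size=(4, 8))   # 4 query tokens, embedding dim 8
doc_tokens = rng.normal(size=(12, 8))    # 12 doc tokens, embedding dim 8

# One vector per side, then a single comparison.
query_vec = mean_pool(query_tokens)
doc_vec = mean_pool(doc_tokens)
no_interaction_score = cosine_similarity(query_vec, doc_vec)
```

Since the doc vector does not depend on the query, it can be computed once and stored, which is what makes this approach so fast.
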
โžก๏ธ Notable examples: Check the top of the MTEB leaderboard!

2. Late-interaction: this is ColBERT 🏃

These models encode each token from both the query and the doc as separate vectors, as before, but compare all the token vectors against each other directly, without first averaging them and losing information.

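This is more accurate than no-interaction but also slower, because for a query of n tokens and a doc of m tokens you have to compare n*m pairs of token vectors instead of a single pair of pooled vectors. At least you can precompute the document token vectors and keep them in memory. And ColBERT has some optimisations, like pooling, to be faster.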
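
As a rough sketch of what "comparing all together" can look like: the ColBERT paper scores a pair with a MaxSim operator, where each query token keeps its best-matching doc token and those maxima are summed. The post itself does not spell this out, so treat the code below as an illustration under that assumption. The helper names are hypothetical and random arrays stand in for real token embeddings.

```python
import numpy as np

def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def maxsim_score(query_tokens: np.ndarray, doc_tokens: np.ndarray) -> float:
    """Late-interaction score: sum over query tokens of the best doc-token match."""
    q = l2_normalize(query_tokens)       # (n, dim)
    d = l2_normalize(doc_tokens)         # (m, dim)
    similarities = q @ d.T               # (n, m): all n*m token-pair comparisons
    return float(similarities.max(axis=1).sum())

# Assumption: random vectors stand in for per-token model outputs.
rng = np.random.default_rng(0)
query_tokens = rng.normal(size=(4, 8))   # n = 4 query tokens
doc_tokens = rng.normal(size=(12, 8))    # m = 12 doc tokens

score = maxsim_score(query_tokens, doc_tokens)
self_score = maxsim_score(query_tokens, query_tokens)
```

The `(n, m)` similarity matrix is where the extra cost comes from, while `doc_tokens` can still be computed ahead of time and stored.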

โžก๏ธ Notable examples: ColBERTv2, mxbai-colbert-large-v1, jina-colbert-v2

3. Early interaction: Cross-encoders 🏋️

Basically, you feed the concatenated query + document into a model to get a final relevance score.

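This is the most accurate, but also the slowest, since you need one full model pass per doc, which takes really long when you have many docs to rerank! And you cannot pre-store embeddings, because the score depends on the query and the doc together.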
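
The shape of that reranking loop can be sketched as follows. Note that `toy_score_pair` is NOT a cross-encoder: it is a hypothetical word-overlap placeholder so the example runs offline, and `rerank` is an illustrative helper, not any library's API. With a real reranker, `score_pair` would be one model forward pass on the concatenated pair.

```python
from typing import Callable, List, Tuple

def rerank(
    query: str,
    docs: List[str],
    score_pair: Callable[[str, str], float],
) -> List[Tuple[str, float]]:
    """Score every (query, doc) pair jointly, then sort by descending score."""
    # One scorer call per doc: nothing about a doc can be cached ahead of time,
    # because the scorer sees the query and the doc together.
    scored = [(doc, score_pair(query, doc)) for doc in docs]
    return sorted(scored, key=lambda item: item[1], reverse=True)

def toy_score_pair(query: str, doc: str) -> float:
    """Placeholder scorer (word-overlap ratio), standing in for a real model."""
    q_words = set(query.lower().split())
    d_words = set(doc.lower().split())
    union = q_words | d_words
    return len(q_words & d_words) / len(union) if union else 0.0

docs = [
    "ColBERT compares token embeddings with late interaction",
    "Bananas are rich in potassium",
    "Cross-encoders score a concatenated query and document",
]
ranked = rerank("how do cross-encoders score a document", docs, toy_score_pair)
```

The cost grows linearly with the number of docs, so this is usually applied only to a short list of candidates.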

โžก๏ธ Notable examples: MixedBread or Jina AI rerankers!

🚀 So what you choose is a trade-off between speed and accuracy: I think ColBERT is often a really good choice!

Based on this great post by Jina AI 👉 https://jina.ai/news/what-is-colbert-and-late-interaction-and-why-they-matter