# Multilingual ColBERT embeddings as a service
## Goal

- Deploy Antoine Louis' colbert-xm as an inference service: text(s) in, vector(s) out
## Motivation

- Use the service as the embedding component of a broader RAG solution
## Steps followed

- Clone the original repo following this procedure
- Add a custom handler script as described here
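The custom handler follows the Hugging Face Inference Endpoints convention: a `handler.py` at the repo root exposing an `EndpointHandler` class with `__init__(path)` and `__call__(data)`. A minimal sketch of that contract is below; the stub encoder, the 128-dimensional per-token vectors, and the `"inputs"` payload key are assumptions standing in for the real ColBERT-XM model, which would be loaded from `path` in `__init__`:

```python
from typing import Any, Dict, List

class EndpointHandler:
    """Sketch of the Inference Endpoints custom-handler contract."""

    def __init__(self, path: str = ""):
        # In the real handler, load the ColBERT-XM checkpoint from `path` here.
        # A stub encoder stands in so this sketch runs without the 3 GB download.
        self.dim = 128  # per-token embedding size (assumption)

    def _encode(self, text: str) -> List[List[float]]:
        # Placeholder: one zero vector per whitespace token.
        # The real model returns one contextual vector per subword token.
        return [[0.0] * self.dim for _ in text.split()]

    def __call__(self, data: Dict[str, Any]) -> List[List[List[float]]]:
        # "inputs" may be a single string or a list of strings (assumption).
        inputs = data["inputs"]
        if isinstance(inputs, str):
            inputs = [inputs]
        return [self._encode(text) for text in inputs]
```

Note that a ColBERT-style service returns a matrix per input text (one vector per token), not a single pooled vector per text.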
## Local development and testing

Build and start the `hf_endpoints_emulator` Docker container:

```shell
docker-compose up -d --build
```

This can take a few moments, given the size of the model (> 3 GB)!
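For reference, a docker-compose service matching the commands in this README might look as follows; everything except the service name is an assumption (build context, port, and volume mount should be checked against the actual `docker-compose.yml` in the repo):

```yaml
services:
  hf_endpoints_emulator:
    build: .                # Dockerfile at the repo root (assumption)
    ports:
      - "5000:5000"         # emulator port; adjust to your setup (assumption)
    volumes:
      - ./:/app             # mount the repo so handler edits are picked up (assumption)
```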
### How to test locally

```shell
./embed_single_query.sh
./embed_two_chunks.sh
docker-compose exec hf_endpoints_emulator pytest
```
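The scripts above POST text to the running service and should get ColBERT-style multi-vector output back. As a sketch of what to expect, the snippet below builds the assumed request payloads and checks the assumed response shape (one fixed-size vector per token, per input text; the `"inputs"` key and 128-dim vectors are assumptions):

```python
import json

# Payload shapes assumed for the emulator endpoint
single = json.dumps({"inputs": "what is ColBERT?"})
batch = json.dumps({"inputs": ["first chunk", "second chunk"]})

def looks_like_colbert_output(resp, n_texts, dim=128):
    """True if resp is one matrix per input text, each row a dim-sized vector."""
    return (
        len(resp) == n_texts
        and all(all(len(vec) == dim for vec in mat) for mat in resp)
    )

# Example: a fake response for 1 text with 2 tokens
fake = [[[0.0] * 128, [0.1] * 128]]
assert looks_like_colbert_output(fake, 1)
```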
### Check output

Follow the container logs:

```shell
docker-compose logs --follow hf_endpoints_emulator
```