Fine Tuning Jina Embedding V3 for classification task

#77
by mumeranwaar - opened

Is there a blog post on fine-tuning Jina Embeddings V3 for a classification task (both LoRA-only and full-model fine-tuning)?

I could not find one. Any guidance would be appreciated.

Hey, I don't think we have a blog post I can point you to that details fine-tuning jina-embeddings-v3, but it's relatively straightforward to do with Sentence Transformers (ST). Just make sure to set default_task to the LoRA adapter you want to train, e.g. 'classification'.

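from sentence_transformers import SentenceTransformer

# Option 1: select the adapter when loading the model
model = SentenceTransformer("jinaai/jina-embeddings-v3", trust_remote_code=True, model_kwargs={'default_task': 'classification'})

Or

# Option 2: load the model first, then select the adapter on the first module
model = SentenceTransformer("jinaai/jina-embeddings-v3", trust_remote_code=True)
model[0].default_task = 'classification'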
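For reference, here is a minimal sketch of what a training run could look like once default_task is set. It is not an official recipe: it assumes sentence-transformers v3 or later and the datasets package are installed, and it picks BatchAllTripletLoss only because that loss accepts (sentence, label) rows. The function names, output directory, and hyperparameters are hypothetical placeholders.

```python
def build_label_map(labels):
    """Map each distinct label to an integer id, in sorted order."""
    return {label: idx for idx, label in enumerate(sorted(set(labels)))}


def finetune_classification_adapter(texts, labels, output_dir="jina-v3-classification-ft"):
    """Sketch: train the 'classification' LoRA adapter on (text, label) pairs."""
    # Heavy imports stay local so build_label_map works without them installed.
    from datasets import Dataset
    from sentence_transformers import (
        SentenceTransformer,
        SentenceTransformerTrainer,
        SentenceTransformerTrainingArguments,
    )
    from sentence_transformers.losses import BatchAllTripletLoss

    label_map = build_label_map(labels)
    train_dataset = Dataset.from_dict(
        {"sentence": list(texts), "label": [label_map[l] for l in labels]}
    )

    # Same loading call as in the reply above.
    model = SentenceTransformer(
        "jinaai/jina-embeddings-v3",
        trust_remote_code=True,
        model_kwargs={"default_task": "classification"},
    )

    # Assumption: a label-based triplet loss is a reasonable choice here;
    # it needs each batch to contain several examples per class.
    loss = BatchAllTripletLoss(model)
    args = SentenceTransformerTrainingArguments(
        output_dir=output_dir,           # hypothetical path
        num_train_epochs=1,              # placeholder value
        per_device_train_batch_size=16,  # placeholder value
    )
    trainer = SentenceTransformerTrainer(
        model=model, args=args, train_dataset=train_dataset, loss=loss
    )
    trainer.train()
    model.save(output_dir)
    return model, label_map
```

Calling finetune_classification_adapter(texts, labels) with your own lists of strings and labels would run one epoch and save the result. Other Sentence Transformers losses may suit your data better.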

You can also fine-tune the main (non-LoRA) parameters by setting the option at https://huggingface.co/jinaai/jina-embeddings-v3/blob/main/config.json#L27 to True and not passing a default_task.
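To check which of the two modes you are actually in, one option is to count the parameters that will receive gradients. The helper below is a hypothetical sketch, not part of any library. It only relies on the standard named_parameters(), numel(), and requires_grad interface of PyTorch modules, and it assumes the model sets requires_grad when it is loaded.

```python
def count_trainable(module):
    """Return (trainable, total) parameter counts for a PyTorch-style module."""
    trainable = 0
    total = 0
    for _name, param in module.named_parameters():
        n = param.numel()
        total += n
        if param.requires_grad:
            trainable += n
    return trainable, total


# Usage with a loaded SentenceTransformer called `model`:
#   trainable, total = count_trainable(model)
#   print(f"{trainable:,} of {total:,} parameters are trainable")
```

With a LoRA adapter selected you would expect only a small fraction of the parameters to be trainable. With the main parameters enabled, the trainable count should be much closer to the total.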

Hope this helps!

Jina AI org

@mumeranwaar You can also take a look at this blogpost: https://jina.ai/news/jina-classifier-for-high-performance-zero-shot-and-few-shot-classification/
Our zero-shot / few-shot classifier API might be just what you're looking for with your classification problem, though the post does not detail how to fine-tune our v3 model.
