lightonai/MonoQwen2-VL-v0.1

uminaty committed 3 days ago

Commit c14099d (1 parent: d3f2009)

Update README.md

Files changed (1)

README.md +1 -1

@@ -12,7 +12,7 @@ base_model:
 # MonoQwen2-VL-v0.1
 ## Model Overview
-The **MonoQwen2-VL-v0.1** is a LoRA of the Qwen2-VL-2B model, optimized for reranking (i.e, asserting pointwise image-query relevance) using the [MonoT5](https://arxiv.org/pdf/2101.05667) objective.
+The **MonoQwen2-VL-v0.1** is a multimodal reranker finetuned with LoRA from [Qwen2-VL-2B](https://huggingface.co/Qwen/Qwen2-VL-2B-Instruct), optimized for asserting pointwise image-query relevance using the [MonoT5](https://arxiv.org/pdf/2101.05667) objective.
 That is, given a couple of image and query fed into the prompt of the VLM, the model is tasked to generate "True" if the image is relevant to the query and "False" otherwise.
 During inference, a relevancy score can then be obtained by comparing the logits of the two tokens and this score can effectively be used to rerank the candidates generated by a first-stage retriever (such as DSE or ColPali) or filter them using a threshold.
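
Reviewer note: the scoring step described in the README text above can be illustrated with a small sketch. It assumes the two logits for the "True" and "False" tokens have already been read from the model output, and that "comparing the logits" means a two-way softmax as in MonoT5; the README text in this diff does not spell out the exact formula. The names `relevance_score` and `rerank` and the candidate tuple layout are hypothetical, chosen only for illustration.

```python
import math


def relevance_score(true_logit: float, false_logit: float) -> float:
    """Probability of "True" under a softmax over the two token logits.

    Assumption: MonoT5-style scoring. A two-way softmax reduces to a
    sigmoid of the logit difference, computed here in a numerically
    stable way.
    """
    diff = true_logit - false_logit
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    e = math.exp(diff)
    return e / (1.0 + e)


def rerank(candidates, threshold=None):
    """Sort first-stage candidates by relevance score, highest first.

    candidates: iterable of (doc_id, true_logit, false_logit) tuples
                (hypothetical layout for this sketch).
    threshold:  if given, drop candidates scoring below it.
    """
    scored = [(doc_id, relevance_score(t, f)) for doc_id, t, f in candidates]
    if threshold is not None:
        scored = [(doc_id, s) for doc_id, s in scored if s >= threshold]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Made-up logits for three retrieved pages, purely for demonstration.
    pages = [("page_a", 0.0, 2.0), ("page_b", 3.0, 0.0), ("page_c", 1.0, 1.0)]
    for doc_id, score in rerank(pages, threshold=0.5):
        print(f"{doc_id}: {score:.3f}")
```

Because the score is a probability in [0, 1], the same value serves both uses mentioned in the README: sorting candidates from a first-stage retriever, or filtering them with a fixed threshold.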