CLIP-FlanT5-XL (VQAScore)
This model is a fine-tuned version of google/flan-t5-xl designed for image-text retrieval tasks, as presented in the VQAScore paper.
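As a rough illustration of how a VQAScore-style score can be used for retrieval, the sketch below phrases a caption as a yes/no question, turns the log-probabilities of the answer tokens into a score, and ranks candidate captions. The question wording follows our reading of the VQAScore paper and should be treated as an assumption. The helpers `build_question`, `answer_probability`, and `rank_captions` are hypothetical names, not part of any released API, and the log-probabilities are placeholders for values a real model would produce.

```python
import math

# Assumed question wording, based on the VQAScore paper; not confirmed by this model card.
QUESTION_TEMPLATE = 'Does this figure show "{}"? Please answer yes or no.'


def build_question(caption: str) -> str:
    """Wrap a caption in the yes/no question that is sent to the model."""
    return QUESTION_TEMPLATE.format(caption)


def answer_probability(answer_token_logprobs):
    """Probability of the answer sequence: exp of the summed token log-probs."""
    return math.exp(sum(answer_token_logprobs))


def rank_captions(scores):
    """Return captions sorted from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    # Placeholder log-probs of the "Yes" answer for two candidate captions.
    placeholder_logprobs = {
        "a dog on a beach": [math.log(0.9)],
        "a cat on a sofa": [math.log(0.2)],
    }
    scores = {c: answer_probability(lp) for c, lp in placeholder_logprobs.items()}
    for caption in rank_captions(scores):
        print(build_question(caption), round(scores[caption], 3))
```

In practice the log-probabilities would come from running the model on an image and the question; that step is omitted here because it depends on the authors' inference code.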
Model Description
- Developed by: Zhiqiu Lin and collaborators
- Model type: Vision-Language Generative Model
- License: Apache-2.0
- Finetuned from model: google/flan-t5-xl
Model tree for zhiqiulin/clip-flant5-xl
Base model
google/flan-t5-xl