---
library_name: transformers
base_model: microsoft/Florence-2-base-ft
tags:
- finetune
- image-to-text
- VQA
- VLM
language:
- en
---
# Visual Question Answering Model
This model is a fine-tuned version of `microsoft/Florence-2-base-ft` for Visual Question Answering (VQA). It is tuned for tasks in which the model interprets an image and answers questions about its visual content.
---
### Model Details
- **Finetuned by:** prithivMLmods
- **Model type:** Visual Question Answering (VQA)
- **Language(s):** English (NLP component)
- **License:** None specified
- **Finetuned from model:** [microsoft/Florence-2-base-ft](https://huggingface.co/microsoft/Florence-2-base-ft)
### Usage
The model takes an image and a question about that image as input and returns an answer based on the visual content.
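
The following is a minimal sketch of how inference might look with `transformers`, not a confirmed recipe for this checkpoint. It assumes the model loads the same way as the base Florence-2 checkpoints (`AutoModelForCausalLM` and `AutoProcessor` with `trust_remote_code=True`). `MODEL_ID` is a placeholder for this model's Hub repository id, and the `<VQA>` task prefix is an assumption, since this card does not state the prompt format used during fine-tuning. The helper names `build_prompt` and `answer_question` are illustrative only.

```python
MODEL_ID = "path/to/this-model"  # placeholder: replace with this model's Hub repo id
TASK_TOKEN = "<VQA>"  # assumption: the fine-tuning task prefix is not stated in this card


def build_prompt(question: str, task_token: str = TASK_TOKEN) -> str:
    """Prepend the (assumed) task prefix to the whitespace-trimmed question."""
    return f"{task_token}{question.strip()}"


def answer_question(image_path: str, question: str, model_id: str = MODEL_ID) -> str:
    """Load the model and processor, then generate an answer for one image."""
    # Imported lazily so build_prompt works without the heavy dependencies.
    import torch
    from PIL import Image
    from transformers import AutoModelForCausalLM, AutoProcessor

    device = "cuda" if torch.cuda.is_available() else "cpu"

    # Florence-2 checkpoints ship custom modeling code, hence trust_remote_code.
    model = AutoModelForCausalLM.from_pretrained(
        model_id, trust_remote_code=True
    ).to(device).eval()
    processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

    image = Image.open(image_path).convert("RGB")
    inputs = processor(
        text=build_prompt(question), images=image, return_tensors="pt"
    ).to(device)

    with torch.no_grad():
        generated_ids = model.generate(
            input_ids=inputs["input_ids"],
            pixel_values=inputs["pixel_values"],
            max_new_tokens=128,
            num_beams=3,
        )

    # Decode the first (only) sequence and drop special tokens.
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0].strip()


# Example call (requires a local image and a valid MODEL_ID):
# print(answer_question("example.jpg", "What is shown in the image?"))
```

If the answers come back empty or malformed, the task prefix is the first thing to check, since a fine-tuned Florence-2 model generally expects the same prompt format it saw during training.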