---
library_name: transformers
base_model: microsoft/Florence-2-base-ft
tags:
- finetune
- image-to-text
- VQA
- VLM
language:
- en
---

# Visual Question Answering Model

This model is a fine-tuned version of `microsoft/Florence-2-base-ft` designed for Visual Question Answering (VQA). It is optimized to interpret an image and answer natural-language questions about its visual content.

---

### Model Details

- **Finetuned by:** prithivMLmods
- **Model type:** Visual Question Answering (VQA)
- **Language(s):** English (NLP component)
- **License:** None specified
- **Finetuned from model:** [microsoft/Florence-2-base-ft](https://huggingface.co/microsoft/Florence-2-base-ft)

### Usage

This model performs VQA tasks: it takes an image and a question about that image as input and returns an answer grounded in the visual content.
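A minimal inference sketch, assuming the standard Florence-2 API from `transformers` (custom modeling code, hence `trust_remote_code=True`). The repository id, image URL, and the `<VQA>` task prefix below are placeholders/assumptions — substitute this model's actual repo id and the task prompt it was fine-tuned with.

```python
import requests
from PIL import Image
from transformers import AutoProcessor, AutoModelForCausalLM

# Placeholder repo id — replace with this model's Hugging Face id.
model_id = "your-username/your-florence2-vqa-model"

# Florence-2 ships custom modeling code, so trust_remote_code is required.
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# Load any RGB image; the URL here is a placeholder.
image = Image.open(
    requests.get("https://example.com/image.jpg", stream=True).raw
).convert("RGB")

# Florence-2 selects its task with a prompt prefix; "<VQA>" is an
# assumption for this fine-tune — use the prefix it was trained with.
prompt = "<VQA>" + "What is the main object in the image?"

inputs = processor(text=prompt, images=image, return_tensors="pt")
generated_ids = model.generate(
    input_ids=inputs["input_ids"],
    pixel_values=inputs["pixel_values"],
    max_new_tokens=64,
    do_sample=False,
)
answer = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(answer)
```

Greedy decoding (`do_sample=False`) is used here because VQA answers are typically short and factual; raise `max_new_tokens` if your questions call for longer answers.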