Add relevant metadata, link to code and paper
This PR adds `library_name` and `pipeline_tag` metadata to the model card so that the model can be found using the filters on the Hub.
It also links the model to the corresponding GitHub repository and paper page.
README.md
CHANGED
@@ -1,8 +1,13 @@
+---
+license: apache-2.0
+library_name: transformers
+pipeline_tag: image-text-to-text
+---
 # <img src="assets/icon.png" width="35" /> ReFocus
 
 This repo contains the model for the paper "ReFocus: Visual Editing as a Chain of Thought for Structured Image Understanding"
 
-[**π Homepage**](https://zeyofu.github.io/ReFocus/) |[**π Paper**](https://
+[**π Homepage**](https://zeyofu.github.io/ReFocus/) |[**π Paper**](https://huggingface.co/papers/2501.05452) | [**π Code**](https://github.com/zeyofu/ReFocus_Code)
 
 
 # Introduction
@@ -22,8 +27,4 @@ This model is finetuned based on Phi-3.5-vision, and we used the following promp
 To enforce the model to generate bounding box coordinates to refocus, you could try this prompt:
 ```
 <|image_1|>\n{question}\nThought: The areas to focus on in the image have bounding box coordinates:
-```
-
----
-license: apache-2.0
----
+```
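For readers who want to try the bounding-box prompt shown in the diff, here is a minimal sketch of filling in the template. It assumes the `\n` sequences in the README stand for newline characters; the names `REFOCUS_BBOX_TEMPLATE` and `build_refocus_prompt` are hypothetical helpers for illustration and do not come from the ReFocus repository.

```python
# Prompt template copied from the README; "\n" is assumed to mean a newline.
REFOCUS_BBOX_TEMPLATE = (
    "<|image_1|>\n{question}\n"
    "Thought: The areas to focus on in the image have bounding box coordinates:"
)


def build_refocus_prompt(question: str) -> str:
    """Insert a question into the bounding-box prompt template (hypothetical helper)."""
    # Strip surrounding whitespace so the question sits on its own line.
    return REFOCUS_BBOX_TEMPLATE.format(question=question.strip())


if __name__ == "__main__":
    print(build_refocus_prompt("Which bar is the tallest?"))
```

The resulting string would then be passed, together with the image, to whatever processor the model expects; that step is left out here because the diff does not show it.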