zijian.kang committed
Commit 5b88afc · 1 Parent(s): b59cd22

slight adjust readme
README.md CHANGED
@@ -20,13 +20,13 @@ In a word, SAIL-VL is a foundational VLM for vision-language applications. Welco
 
 ## Model Card
 
-Model Architecture:
+### Model Architecture:
 
 | Architecture | ViT | LLM | Adapter | Token Merge | Resolution |
 | --- | --- | --- | --- | --- | --- |
 | SAIL-VL-2B | [🤗InternViT-300M](https://huggingface.co/OpenGVLab/InternViT-300M-448px) | [🤗Qwen2.5-1.5B](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct) | 2-layer MLP | 2x2 | 448x448xN |
 
-Training
+### Training Recipes Overview:
 
 Sail-VL benefits from high-quality data and carefully curated training recipes. We find the data quality, quantity and the design of curriculum training pipeline is crucial for model performance. With the proper design and data, the model's capacity scales effectively with data expansion at all stages, leading to enhanced performance. More details will be released soon.
 
@@ -89,14 +89,14 @@ We visualize some of examples from LLaVA-Bench to show the capabilities of our m
 
 ## How to Use
 
-The basic usage and dynamic crop strategy of SAIL-VL follows InternVL2, you can easily switch Intern-VL series models to our model. Here is a simple example of using our model:
+The basic usage and dynamic crop strategy of SAIL-VL follows InternVL2, you can easily switch Intern-VL series of models to our model. Here is a simple example of using our model:
 
-Requirements:
+### Requirements:
 ```
 pip3 install einops transformers timm
 ```
 
-Code:
+### Code:
 
 ```Python
 import numpy as np