Commit 5d52c2c by AdaptLLM (verified) · 1 parent: 1c5f335

Update README.md

  1. README.md +32 -2
README.md CHANGED
@@ -26,7 +26,30 @@ We investigate domain adaptation of MLLMs through post-training, focusing on dat
  <img src="https://cdn-uploads.huggingface.co/production/uploads/650801ced5578ef7e20b33d4/-Jp7pAsCR2Tj4WwfwsbCo.png" width="600">
  </p>
 
- ## How to use
  1. Set up
  ```bash
  pip install qwen-vl-utils
@@ -57,6 +80,8 @@ processor = AutoProcessor.from_pretrained("AdaptLLM/medicine-Qwen2-VL-2B-Instruc
  # max_pixels = 1280*28*28
  # processor = AutoProcessor.from_pretrained("AdaptLLM/medicine-Qwen2-VL-2B-Instruct", min_pixels=min_pixels, max_pixels=max_pixels)

  messages = [
  {
  "role": "user",
@@ -94,8 +119,13 @@ output_text = processor.batch_decode(
  )
  print(output_text)
  ```
 
- Since our model architecture aligns with the base model, you can refer to the official repository of [Qwen-2-VL](https://huggingface.co/Qwen/Qwen2-VL-2B-Instruct/edit/main/README.md) for more advanced usage instructions.
 
  ## Citation
  If you find our work helpful, please cite us.
 
  <img src="https://cdn-uploads.huggingface.co/production/uploads/650801ced5578ef7e20b33d4/-Jp7pAsCR2Tj4WwfwsbCo.png" width="600">
  </p>
 
+ ## Resources
+ **🤗 We share our data and models with example usages, feel free to open any issues or discussions! 🤗**
+
+ | Model | Repo ID in HF 🤗 | Domain | Base Model | Training Data | Evaluation Benchmark |
+ |:----------------------------------------------------------------------------|:--------------------------------------------|:--------------|:-------------------------|:------------------------------------------------------------------------------------------------|-----------------------|
+ | [Visual Instruction Synthesizer](https://huggingface.co/AdaptLLM/visual-instruction-synthesizer) | AdaptLLM/visual-instruction-synthesizer | - | open-llava-next-llama3-8b | VisionFLAN and ALLaVA | - |
+ | [AdaMLLM-med-2B](https://huggingface.co/AdaptLLM/biomed-Qwen2-VL-2B-Instruct) | AdaptLLM/biomed-Qwen2-VL-2B-Instruct | Biomedicine | Qwen2-VL-2B-Instruct | [biomed-visual-instructions](https://huggingface.co/datasets/AdaptLLM/biomed-visual-instructions) | [biomed-VQA-benchmark](https://huggingface.co/datasets/AdaptLLM/biomed-VQA-benchmark) |
+ | [AdaMLLM-food-2B](https://huggingface.co/AdaptLLM/food-Qwen2-VL-2B-Instruct) | AdaptLLM/food-Qwen2-VL-2B-Instruct | Food | Qwen2-VL-2B-Instruct | [food-visual-instructions](https://huggingface.co/datasets/AdaptLLM/food-visual-instructions) | [food-VQA-benchmark](https://huggingface.co/datasets/AdaptLLM/food-VQA-benchmark) |
+ | [AdaMLLM-med-8B](https://huggingface.co/AdaptLLM/biomed-LLaVA-NeXT-Llama3-8B) | AdaptLLM/biomed-LLaVA-NeXT-Llama3-8B | Biomedicine | open-llava-next-llama3-8b | [biomed-visual-instructions](https://huggingface.co/datasets/AdaptLLM/biomed-visual-instructions) | [biomed-VQA-benchmark](https://huggingface.co/datasets/AdaptLLM/biomed-VQA-benchmark) |
+ | [AdaMLLM-food-8B](https://huggingface.co/AdaptLLM/food-LLaVA-NeXT-Llama3-8B) |AdaptLLM/food-LLaVA-NeXT-Llama3-8B | Food | open-llava-next-llama3-8b | [food-visual-instructions](https://huggingface.co/datasets/AdaptLLM/food-visual-instructions) | [food-VQA-benchmark](https://huggingface.co/datasets/AdaptLLM/food-VQA-benchmark) |
+ | [AdaMLLM-med-11B](https://huggingface.co/AdaptLLM/biomed-Llama-3.2-11B-Vision-Instruct) | AdaptLLM/biomed-Llama-3.2-11B-Vision-Instruct | Biomedicine | Llama-3.2-11B-Vision-Instruct | [biomed-visual-instructions](https://huggingface.co/datasets/AdaptLLM/biomed-visual-instructions) | [biomed-VQA-benchmark](https://huggingface.co/datasets/AdaptLLM/biomed-VQA-benchmark) |
+ | [AdaMLLM-food-11B](https://huggingface.co/AdaptLLM/food-Llama-3.2-11B-Vision-Instruct) | AdaptLLM/food-Llama-3.2-11B-Vision-Instruct | Food | Llama-3.2-11B-Vision-Instruct | [food-visual-instructions](https://huggingface.co/datasets/AdaptLLM/food-visual-instructions) | [food-VQA-benchmark](https://huggingface.co/datasets/AdaptLLM/food-VQA-benchmark) |
+
+ **Code**: [https://github.com/bigai-ai/QA-Synthesizer](https://github.com/bigai-ai/QA-Synthesizer)
+
+ ## 1. To Chat with AdaMLLM
+
+ Our model architecture aligns with the base model: Qwen-2-VL-Instruct. Below, we provide a usage example. For more advanced usage instructions, please refer to the official [Qwen-2-VL-Instruct repository](https://huggingface.co/Qwen/Qwen2-VL-2B-Instruct/edit/main/README.md).
+
+ **Note:** For AdaMLLM, always place the image at the beginning of the input instruction in the messages.
+
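Editorial sketch (not part of the commit): the note above about image placement can be illustrated with plain Python, assuming the list-of-dicts message layout that Qwen2-VL chat examples use, where each content entry carries a `type` of `image` or `text`. The helper names `build_messages` and `images_first` and the image path are hypothetical and exist only for this illustration.

```python
from typing import Any


def build_messages(image: str, instruction: str) -> list[dict[str, Any]]:
    """Return a single-turn chat whose image entry precedes the text entry."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "image": image},      # image goes first
                {"type": "text", "text": instruction},  # instruction follows
            ],
        }
    ]


def images_first(messages: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return a copy of messages with image entries moved ahead of other entries."""
    reordered = []
    for message in messages:
        content = message["content"]
        if isinstance(content, list):
            # sorted() is stable, so images keep their relative order,
            # as do the remaining entries.
            content = sorted(content, key=lambda part: part.get("type") != "image")
        reordered.append({**message, "content": content})
    return reordered


# Placeholder path and question, chosen only for illustration.
messages = build_messages("path/to/image.jpg", "What does this image show?")
```

If a message list was assembled elsewhere with the text before the image, passing it through `images_first` before applying the chat template is one way to satisfy the note without rewriting the list by hand.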
+ <details>
+ <summary> Click to expand </summary>
+
  1. Set up
  ```bash
  pip install qwen-vl-utils
 
  # max_pixels = 1280*28*28
  # processor = AutoProcessor.from_pretrained("AdaptLLM/medicine-Qwen2-VL-2B-Instruct", min_pixels=min_pixels, max_pixels=max_pixels)
 
+
+ # NOTE: For AdaMLLM, always place the image at the beginning of the input instruction in the messages.
  messages = [
  {
  "role": "user",
 
  )
  print(output_text)
  ```
+ </details>
+
+ ## 2. To Evaluate AdaMLLM on Domain-Specific Benchmarks
+
+ Refer to the [biomed-VQA-benchmark](https://huggingface.co/datasets/AdaptLLM/biomed-VQA-benchmark) to reproduce our results and evaluate many other MLLMs on domain-specific benchmarks.
+
 
 
 
  ## Citation
  If you find our work helpful, please cite us.