---
license: apache-2.0
language:
- en
metrics:
- accuracy
base_model:
- meta-llama/Llama-3.1-8B-Instruct
pipeline_tag: image-text-to-text
---

# MMEvol Model Card

## Model Details

The pretrained projector weights and the instruction-tuned checkpoint are listed below.

| Model | Pretrained Projector | Base LLM | Pretraining (PT) Data | Instruction-Tuning (IT) Data | Download |
| ---------------- | -------------------- | --------- | --------------------- | ---------------------------- | -------- |
| MMEvol-LLaMA3-8B | [mm_projector](https://huggingface.co/Tongyi-ConvAI/MMEvol/tree/main/llama3) | LLaMA3-8B | [LLaVA-Pretrain](https://huggingface.co/datasets/liuhaotian/LLaVA-Pretrain) | MMEvol | [ckpt](https://huggingface.co/Tongyi-ConvAI/MMEvol/tree/main/llama3) |

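
The download links above point at a subfolder of a Hugging Face Hub repository. As a minimal sketch, assuming the `huggingface_hub` package is installed, the files could be fetched as shown below. The helper names `split_hub_tree_url` and `download_checkpoint` and the `./MMEvol` target directory are illustrative choices, not part of the MMEvol release, and the sketch does not cover loading the weights into a model.

```python
from urllib.parse import urlparse


def split_hub_tree_url(url: str) -> tuple[str, str, str]:
    """Split a Hub 'tree' URL into (repo_id, revision, subfolder)."""
    parts = urlparse(url).path.strip("/").split("/")
    # Expected layout: <owner>/<repo>/tree/<revision>/<subfolder...>
    # (assumes the revision name itself contains no slash, e.g. "main")
    if len(parts) < 4 or parts[2] != "tree":
        raise ValueError(f"not a Hub tree URL: {url}")
    repo_id = "/".join(parts[:2])
    revision = parts[3]
    subfolder = "/".join(parts[4:])
    return repo_id, revision, subfolder


def download_checkpoint(url: str, local_dir: str = "./MMEvol") -> str:
    """Download only the files under the URL's subfolder; return the local path."""
    # Imported lazily so the URL helper works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    repo_id, revision, subfolder = split_hub_tree_url(url)
    # Restrict the download to the subfolder when the URL names one.
    patterns = [f"{subfolder}/*"] if subfolder else None
    return snapshot_download(
        repo_id=repo_id,
        revision=revision,
        allow_patterns=patterns,
        local_dir=local_dir,
    )


# Link taken from the "Download" column of the table above.
CKPT_URL = "https://huggingface.co/Tongyi-ConvAI/MMEvol/tree/main/llama3"

if __name__ == "__main__":
    print(split_hub_tree_url(CKPT_URL))
    # Uncomment to fetch the weights (requires network access):
    # print(download_checkpoint(CKPT_URL))
```

Restricting the download with `allow_patterns` avoids pulling any other folders that the repository may contain.
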
## Performance

### Benchmarks Supported by VLMEvalKit (OpenCompass)

| Model | MME_C | MMStar | HallBench | MathVista_mini | MMMU_val | AI2D | POPE | BLINK | RWQA |
| ---------------- | ----- | ------ | --------- | -------------- | -------- | ---- | ---- | ----- | ---- |
| MMEvol-LLaMA3-8B | 47.8 | 50.1 | 62.3 | 50.0 | 40.8 | 73.9 | 86.8 | 46.4 | 62.6 |

### Benchmarks Not Supported by VLMEvalKit (VQADataSet)

| Model | VQA_v2 | GQA | MIA | MMSInst |
| ---------------- | ------ | ---- | ---- | ------- |
| MMEvol-LLaMA3-8B | 83.4 | 65.0 | 78.8 | 32.3 |

## Paper or resources for more information

- Page: https://mmevol.github.io/
- arXiv: https://arxiv.org/pdf/2409.05840

## License

Llama 3 is licensed under the LLAMA 3 Community License, Copyright (c) Meta Platforms, Inc. All Rights Reserved.

## Contact us if you have any questions

- Run Luo: r.luo@siat.ac.cn
- Haonan Zhang: zchiowal@gmail.com