---
license: apache-2.0
language:
- en
metrics:
- accuracy
base_model:
- meta-llama/Llama-3.1-8B-Instruct
pipeline_tag: visual-question-answering
---

# MMEvol Model Card

## Model Details

The pretrained weights and the instruction-tuned weights are listed below.

| Model            | Pretrained Projector | Base LLM  | PT Data                                                      | IT Data | Download |
| ---------------- | -------------------- | --------- | ------------------------------------------------------------ | ------- | -------- |
| MMEvol-LLaMA3-8B | [mm_projector](https://huggingface.co/Tongyi-ConvAI/MMEvol/tree/main/llama3) | LLaMA3-8B | [LLaVA-Pretrain](https://huggingface.co/datasets/liuhaotian/LLaVA-Pretrain) | MMEvol | [ckpt](https://huggingface.co/Tongyi-ConvAI/MMEvol/tree/main/llama3)|
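
The files linked in the table can also be fetched programmatically. The following is a minimal sketch, assuming the `huggingface_hub` package is installed; the repository ID and the `llama3` subfolder are taken from the links above, while the helper names and the `local_dir` default are hypothetical. It only downloads the files. Loading them into a model depends on the project's own code and is not covered here.

```python
# Minimal sketch: download only the llama3/ subfolder of the MMEvol repository.
# REPO_ID and SUBFOLDER come from the table links; everything else is illustrative.
REPO_ID = "Tongyi-ConvAI/MMEvol"
SUBFOLDER = "llama3"


def patterns_for(subfolder: str) -> list[str]:
    """Build the glob patterns that restrict the download to one subfolder."""
    return [f"{subfolder}/*"]


def download_mmevol(local_dir: str = "./mmevol-llama3-8b") -> str:
    """Download the checkpoint files and return the local directory path.

    The import is done lazily so that this module can be imported even when
    huggingface_hub is not installed.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=patterns_for(SUBFOLDER),
        local_dir=local_dir,
    )

# Usage (requires network access):
#   path = download_mmevol()
#   print(path)
```

Restricting `allow_patterns` to the subfolder avoids pulling any other files that may live in the same repository.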

## Training dataset
- [480k MMEvol Curated Instruction Tuning Data](https://huggingface.co/datasets/Tongyi-ConvAI/MMEvol).

## Performance

### Supported by VLMEvalKit (OpenCompass)

| Model            | MME_C | MMStar | HallBench | MathVista_mini | MMMU_val | AI2D | POPE | BLINK | RWQA |
| ---------------- | ----- | ------ | --------- | -------------- | -------- | ---- | ---- | ----- | ---- |
| MMEvol-LLaMA3-8B | 47.8  | 50.1   | 62.3      | 50.0           | 40.8     | 73.9 | 86.8 | 46.4  | 62.6 |

### Not Supported by VLMEvalKit (VQADataSet)

| Model            | VQA_v2 | GQA  | MIA  | MMSInst |
| ---------------- | ------ | ---- | ---- | ------- |
| MMEvol-LLaMA3-8B | 83.4   | 65.0 | 78.8 | 32.3    |


## Paper or resources for more information
- Page: https://mmevol.github.io/
- arXiv: https://arxiv.org/pdf/2409.05840

## License
Llama 3 is licensed under the LLAMA 3 Community License, 
Copyright (c) Meta Platforms, Inc. All Rights Reserved.

## Contact us if you have any questions

- Run Luo — r.luo@siat.ac.cn
- Haonan Zhang — zchiowal@gmail.com