JunnanLi committed on
Commit
9ef4cc8
1 Parent(s): 31ba759

Update README.md

  1. README.md +13 -13
README.md CHANGED
@@ -36,19 +36,19 @@ tags:
  | Aria | < HF link - TBD> | • Activation: 3.9B (3.5B MoE + 0.4B Visual Encoder) <br> • Total: 25.3B | 64K | -->
 
  ## Benchmark
- | Category | Benchmark | Aria | Pixtral 12B | Llama3.2 11B | GPT-4o mini | GPT-4o | Gemini-1.5 Flash | Gemini-1.5 Pro |
- |-------------------------------------|-------------------|-------|-------------|--------------|-------------|--------|------------------|----------------|
- | **Knowledge (Multimodal)** | MMMU | 54.9 | 52.5 | 49.6 | 59.4 | 69.1 | 56.1 | 62.2 |
- | **Math (Multimodal)** | MathVista | 66.1 | 58.0 | 51.5 | - | 54.7 | 63.8 | 58.4 |
- | **Document** | DocQA | 92.6 | 90.7 | 84.4 | - | 92.8 | 89.9 | 93.1 |
- | **Chart** | ChartQA | 86.4 | 81.8 | 78.7 | - | 85.7 | 85.4 | 87.2 |
- | **Scene Text** | TextVQA | 81.1 | - | 78.2 | - | - | 78.7 | 78.7 |
- | **General Visual QA** | MMBench-1.1 | 80.3 | - | - | 76.0 | 82.2 | - | 73.9 |
- | **Video Understanding** | LongVideoBench | 66.6 | 47.4 | 45.7 | 58.8 | 66.7 | 62.4 | 64.4 |
- | **Knowledge (Language)** | MMLU (5-shot) | 73.3 | 69.2 | 69.4 | - | 89.1 | 78.9 | 85.9 |
- | **Math (Language)** | MATH | 50.8 | 48.1 | 51.9 | 70.2 | 76.6 | - | - |
- | **Reasoning (Language)** | ARC Challenge | 91.0 | - | 83.4 | 96.4 | 96.7 | - | - |
- | **Coding** | HumanEval | 73.2 | 72.0 | 72.6 | 87.2 | 90.2 | 74.3 | 84.1 |
+ | Category | Benchmark | Aria | Pixtral 12B | Llama3.2 11B | GPT-4o mini | Gemini-1.5 Flash |
+ |-------------------------------------|-------------------|-------|-------------|--------------|-------------|------------------|
+ | **Knowledge (Multimodal)** | MMMU | 54.9 | 52.5 | 49.6 | 59.4 | 56.1 |
+ | **Math (Multimodal)** | MathVista | 66.1 | 58.0 | 51.5 | - | 63.8 |
+ | **Document** | DocQA | 92.6 | 90.7 | 84.4 | - | 89.9 |
+ | **Chart** | ChartQA | 86.4 | 81.8 | 78.7 | - | 85.4 |
+ | **Scene Text** | TextVQA | 81.1 | - | 78.2 | - | 78.7 |
+ | **General Visual QA** | MMBench-1.1 | 80.3 | - | - | 76.0 | - |
+ | **Video Understanding** | LongVideoBench | 66.6 | 47.4 | 45.7 | 58.8 | 62.4 |
+ | **Knowledge (Language)** | MMLU (5-shot) | 73.3 | 69.2 | 69.4 | - | 78.9 |
+ | **Math (Language)** | MATH | 50.8 | 48.1 | 51.9 | 70.2 | - |
+ | **Reasoning (Language)** | ARC Challenge | 91.0 | - | 83.4 | 96.4 | - |
+ | **Coding** | HumanEval | 73.2 | 72.0 | 72.6 | 87.2 | 74.3 |
 
 
  ## Quick Start