puneeshkhanna committed on
Commit
b61e7ea
1 Parent(s): 0c47f89

Update README.md

Update gemma2-9b results

Files changed (1)
  1. README.md +7 -7
README.md CHANGED
@@ -86,21 +86,21 @@ We report in the following table our internal pipeline benchmarks:
  <td>MMLU (5-shot)</td>
  <td>65.2</td>
  <td>74.2</td>
- <td>-</td>
+ <td>70.8</td>
  <td>67.5</td>
  </tr>
  <tr>
  <td>MMLU-PRO (5-shot)</td>
  <td>32.7</td>
  <td>43.5</td>
- <td>-</td>
+ <td>41.4</td>
  <td>39.2</td>
  </tr>
  <tr>
  <td>IFEval</td>
  <td>12.0</td>
  <td>33.9</td>
- <td>-</td>
+ <td>21.2</td>
  <td>34.3</td>
  </tr>
  <tr>
@@ -115,7 +115,7 @@ We report in the following table our internal pipeline benchmarks:
  <td>MATH(4-shot)</td>
  <td>4.1</td>
  <td>15.5</td>
- <td>-</td>
+ <td>10.5</td>
  <td>18.0</td>
  </tr>
  <tr>
@@ -130,21 +130,21 @@ We report in the following table our internal pipeline benchmarks:
  <td>GPQA (0-shot)</td>
  <td>31.0</td>
  <td>33.0</td>
- <td>-</td>
+ <td>33.4</td>
  <td>35.5</td>
  </tr>
  <tr>
  <td>MUSR (0-shot)</td>
  <td>38.0</td>
  <td>44.2</td>
- <td>-</td>
+ <td>45.3</td>
  <td>47.3</td>
  </tr>
  <tr>
  <td>BBH (3-shot)</td>
  <td>46.5</td>
  <td>54.0</td>
- <td>-</td>
+ <td>54.3</td>
  <td>51.0</td>
  </tr>
  <tr>