puneeshkhanna
committed on
Commit 680242d
1 Parent(s): 96b844d
Update README.md
Update code evals
README.md
CHANGED
@@ -223,6 +223,19 @@ We report in the following table our internal pipeline benchmarks:
 <td>74.2</td>
 <td><b>86.3</td>
 </tr>
+<tr>
+<td rowspan="2">Code</td>
+<td>EvalPlus (0-shot) (avg)</td>
+<td>69.4</td>
+<td>58.9</td>
+<td><b>74.7</b></td>
+</tr>
+<tr>
+<td>Multipl-E (0-shot) (avg)</td>
+<td>-</td>
+<td>34.5</td>
+<td><b>45.8</b></td>
+</tr>
 </tbody>
 </table>
 