alpayariyak committed · Commit dfcf6be · 1 Parent(s): 4626593
Update README.md
README.md CHANGED
@@ -201,6 +201,7 @@ Score 5: {orig_score5_description}
 
 All models are evaluated in chat mode (e.g. with the respective conversation template applied). All zero-shot benchmarks follow the same setting as in the AGIEval paper and Orca paper. CoT tasks use the same configuration as Chain-of-Thought Hub, HumanEval is evaluated with EvalPlus, and MT-bench is run using FastChat. To reproduce our results, follow the instructions in [our repository](https://github.com/imoneoi/openchat/#benchmarks).
 
+
 </details>
 <div>
 <h3>HumanEval+</h3>