angelahzyuan committed
Commit 97252e2 · 1 Parent(s): 1a827d7
Update README.md
README.md CHANGED
@@ -36,9 +36,9 @@ This model was developed using [Self-Play Preference Optimization](https://arxiv
 | Mistral7B-PairRM-SPPO Iter 1 | 24.79 | 23.51 | 1855 |
 | Mistral7B-PairRM-SPPO Iter 2 | 26.89 | 27.62 | 2019 |
 | Mistral7B-PairRM-SPPO Iter 3 | 28.53 | 31.02 | 2163 |
-| Mistral7B-PairRM-SPPO Iter 1 (best-of-16) |
-| Mistral7B-PairRM-SPPO Iter 2 (best-of-16) |
-| Mistral7B-PairRM-SPPO Iter 3 (best-of-16) |
+| Mistral7B-PairRM-SPPO Iter 1 (best-of-16) | 28.71 | 27.77 | 1901 |
+| Mistral7B-PairRM-SPPO Iter 2 (best-of-16) | 31.23 | 32.12 | 2035 |
+| Mistral7B-PairRM-SPPO Iter 3 (best-of-16) | 32.13 | 34.94 | 2174 |
 
 ## [Arena-Hard Evaluation Results](https://github.com/lm-sys/arena-hard)
 