robinsmits committed on
Commit c25c91a • 1 Parent(s): cca8d18

Adding Evaluation Results
This is an automated PR created with https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr

The purpose of this PR is to add evaluation results from the Open LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr/discussions
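For context on how a PR like this can be produced: the sketch below uses the `huggingface_hub` client to append a results section to a model card and open it as a pull request. It is a minimal illustration under stated assumptions, not the actual code behind the Weyaxi space; the repo id is taken from this PR, the metric values from the table added below, and `build_results_section` is a hypothetical helper.

```python
# Minimal sketch of opening an automated model-card PR with huggingface_hub.
# Assumption-laden illustration, not the Weyaxi space's real implementation.
# Requires a write token (e.g. via HF_TOKEN or huggingface-cli login).
from huggingface_hub import ModelCard

REPO_ID = "robinsmits/Qwen1.5-7B-Dutch-Chat-Dpo"  # target repo, from this PR

# Metric values as reported in the table added by this PR.
RESULTS = {
    "AI2 Reasoning Challenge (25-Shot)": 50.77,
    "HellaSwag (10-Shot)": 74.24,
    "MMLU (5-Shot)": 60.70,
    "TruthfulQA (0-shot)": 42.37,
    "Winogrande (5-shot)": 68.11,
    "GSM8k (5-shot)": 27.45,
}

def build_results_section(results: dict) -> str:
    # Hypothetical helper: render the markdown section this PR appends.
    avg = sum(results.values()) / len(results)
    rows = "\n".join(f"|{name}|{value:.2f}|" for name, value in results.items())
    return (
        "\n# [Open LLM Leaderboard Evaluation Results]"
        "(https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)\n"
        "| Metric |Value|\n"
        "|---------------------------------|----:|\n"
        f"|Avg. |{avg:.2f}|\n"
        f"{rows}\n"
    )

# Load the current card, append the section, and push it as a pull request
# (create_pr=True opens a PR instead of committing directly to main).
card = ModelCard.load(REPO_ID)
card.text += build_results_section(RESULTS)
card.push_to_hub(
    REPO_ID,
    create_pr=True,
    commit_message="Adding Evaluation Results",
)
```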
README.md CHANGED

````diff
@@ -1,4 +1,6 @@
 ---
+language:
+- nl
 license: cc-by-nc-4.0
 library_name: peft
 tags:
@@ -8,15 +10,13 @@ tags:
 - generated_from_trainer
 - qwen2
 base_model: Qwen/Qwen1.5-7B-Chat
-model-index:
-- name: Qwen1.5-7B-Dutch-Chat-Dpo
-  results: []
-language:
-- nl
 datasets:
 - BramVanroy/ultra_feedback_dutch_cleaned
 pipeline_tag: text-generation
 inference: false
+model-index:
+- name: Qwen1.5-7B-Dutch-Chat-Dpo
+  results: []
 ---
 
 # Qwen1.5-7B-Dutch-Chat-Dpo
@@ -144,4 +144,17 @@ Thanks to the creators of Qwen1.5 for there great work!
   journal={arXiv preprint arXiv:2309.16609},
   year={2023}
 }
-```
+```
+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_robinsmits__Qwen1.5-7B-Dutch-Chat-Dpo)
+
+| Metric                          |Value|
+|---------------------------------|----:|
+|Avg.                             |53.94|
+|AI2 Reasoning Challenge (25-Shot)|50.77|
+|HellaSwag (10-Shot)              |74.24|
+|MMLU (5-Shot)                    |60.70|
+|TruthfulQA (0-shot)              |42.37|
+|Winogrande (5-shot)              |68.11|
+|GSM8k (5-shot)                   |27.45|
+
````
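Two quick follow-ups on the added table, as a sketch: the Avg. row is simply the arithmetic mean of the six benchmark scores, and per-sample details live in the dataset linked in the diff. In the snippet below, the config name `harness_winogrande_5` and the `latest` split are assumptions based on the naming convention of the leaderboard's auto-generated details datasets, not something stated in this PR; verify them against the dataset card.

```python
# Sanity check: Avg. is the arithmetic mean of the six benchmark scores.
scores = [50.77, 74.24, 60.70, 42.37, 68.11, 27.45]
print(round(sum(scores) / len(scores), 2))  # -> 53.94, matching the Avg. row

# Fetching the detailed results. The config name ("harness_winogrande_5") and
# the "latest" split are assumptions; confirm them on the dataset card.
from datasets import load_dataset

details = load_dataset(
    "open-llm-leaderboard/details_robinsmits__Qwen1.5-7B-Dutch-Chat-Dpo",
    "harness_winogrande_5",
    split="latest",
)
print(details)
```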