leaderboard-pr-bot committed on
Commit d5a318d
1 Parent(s): 7cb7df3
Adding Evaluation Results
This is an automated PR created with https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr
The purpose of this PR is to add evaluation results from the Open LLM Leaderboard to your model card.
If you encounter any issues, please report them to https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr/discussions
README.md
CHANGED
@@ -1,24 +1,31 @@
|
|
1 |
---
|
2 |
-
license: apache-2.0
|
3 |
-
library_name: peft
|
4 |
-
base_model: rishiraj/CatPPT-base
|
5 |
-
datasets:
|
6 |
-
- HuggingFaceH4/no_robots
|
7 |
language:
|
8 |
- en
|
9 |
-
|
10 |
-
|
11 |
-
<|system|>
|
12 |
-
You are a friendly chatbot who always responds in the style of a pirate</s>
|
13 |
-
<|user|>
|
14 |
-
How many helicopters can a human eat in one sitting?</s>
|
15 |
-
<|assistant|>
|
16 |
-
output:
|
17 |
-
text: >-
|
18 |
-
Aye, me hearties! 'Tis not likely a human can eat a helicopter in any sittin', let alone one! They be too big and made of metal, and not fit for consumption. But if ye be referrin' to helicopter snacks, like nuts and trail mix, then a human might be able to munch a goodly amount in one sittin'. Arr!
|
19 |
tags:
|
20 |
- generated_from_trainer
|
21 |
- merge
|
22 |
pipeline_tag: text-generation
|
23 |
model-index:
|
24 |
- name: CatPPT
|
@@ -121,4 +128,17 @@ The following hyperparameters were used during training:
|
|
121 |
journal = {Hugging Face repository},
|
122 |
howpublished = {\url{https://huggingface.co/rishiraj/CatPPT}}
|
123 |
}
|
124 |
-
```
|
1 |
---
|
2 |
language:
|
3 |
- en
|
4 |
+
license: apache-2.0
|
5 |
+
library_name: peft
|
6 |
tags:
|
7 |
- generated_from_trainer
|
8 |
- merge
|
9 |
+
datasets:
|
10 |
+
- HuggingFaceH4/no_robots
|
11 |
+
base_model: rishiraj/CatPPT-base
|
12 |
+
widget:
|
13 |
+
- text: '<|system|>
|
14 |
+
|
15 |
+
You are a friendly chatbot who always responds in the style of a pirate</s>
|
16 |
+
|
17 |
+
<|user|>
|
18 |
+
|
19 |
+
How many helicopters can a human eat in one sitting?</s>
|
20 |
+
|
21 |
+
<|assistant|>
|
22 |
+
|
23 |
+
'
|
24 |
+
output:
|
25 |
+
text: Aye, me hearties! 'Tis not likely a human can eat a helicopter in any sittin',
|
26 |
+
let alone one! They be too big and made of metal, and not fit for consumption.
|
27 |
+
But if ye be referrin' to helicopter snacks, like nuts and trail mix, then a
|
28 |
+
human might be able to munch a goodly amount in one sittin'. Arr!
|
29 |
pipeline_tag: text-generation
|
30 |
model-index:
|
31 |
- name: CatPPT
|
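The widget entry in the new front matter stores the example prompt as a single string with `<|system|>`, `<|user|>`, and `<|assistant|>` markers, each turn closed by `</s>`. A minimal sketch of assembling a prompt in that layout follows. The function name `build_prompt` is hypothetical, and the single-newline separators are an assumption read off the widget text, not a documented template for this model.

```python
def build_prompt(system: str, user: str) -> str:
    """Assemble a prompt in the layout used by the widget example.

    Assumption: each role marker sits on its own line, and the system
    and user turns end with the </s> token, as in the widget text.
    """
    return (
        f"<|system|>\n{system}</s>\n"
        f"<|user|>\n{user}</s>\n"
        "<|assistant|>\n"
    )


prompt = build_prompt(
    "You are a friendly chatbot who always responds in the style of a pirate",
    "How many helicopters can a human eat in one sitting?",
)
print(prompt)
```

The trailing `<|assistant|>` line is left open so that the model's generated text continues from that point.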
|
|
128 |
journal = {Hugging Face repository},
|
129 |
howpublished = {\url{https://huggingface.co/rishiraj/CatPPT}}
|
130 |
}
|
131 |
+
```
|
132 |
+
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
|
133 |
+
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_rishiraj__CatPPT)
|
134 |
+
| Metric                          |Value|
|---------------------------------|----:|
|Avg.                             |72.32|
|AI2 Reasoning Challenge (25-Shot)|68.09|
|HellaSwag (10-Shot)              |86.69|
|MMLU (5-Shot)                    |65.16|
|TruthfulQA (0-shot)              |61.55|
|Winogrande (5-shot)              |81.61|
|GSM8k (5-shot)                   |70.81|
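The `Avg.` row can be sanity-checked against the six benchmark scores. The sketch below assumes the average is the unweighted arithmetic mean of those scores. That assumption is consistent with the numbers in the table, but the PR itself does not state it.

```python
# Benchmark scores copied from the results table in this PR.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 68.09,
    "HellaSwag (10-Shot)": 86.69,
    "MMLU (5-Shot)": 65.16,
    "TruthfulQA (0-shot)": 61.55,
    "Winogrande (5-shot)": 81.61,
    "GSM8k (5-shot)": 70.81,
}

# Assumption: "Avg." is the plain mean of the six values.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # prints 72.32
```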