leaderboard-pr-bot committed on
Commit
51b05e5
1 Parent(s): 4296bda

Adding Evaluation Results


This is an automated PR created with https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr

The purpose of this PR is to add evaluation results from the Open LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr/discussions

Files changed (1): README.md (+120 −4)
README.md CHANGED
@@ -1,10 +1,8 @@
 ---
-license: other
-license_name: gemma-terms-of-use
-license_link: https://ai.google.dev/gemma/terms
 language:
 - de
 - en
+license: other
 tags:
 - sft
 - dpo
@@ -13,6 +11,111 @@ tags:
 - finetune
 - work in progress
 - alpha
+license_name: gemma-terms-of-use
+license_link: https://ai.google.dev/gemma/terms
+model-index:
+- name: SauerkrautLM-Gemma-7b
+  results:
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: AI2 Reasoning Challenge (25-Shot)
+      type: ai2_arc
+      config: ARC-Challenge
+      split: test
+      args:
+        num_few_shot: 25
+    metrics:
+    - type: acc_norm
+      value: 59.98
+      name: normalized accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=VAGOsolutions/SauerkrautLM-Gemma-7b
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: HellaSwag (10-Shot)
+      type: hellaswag
+      split: validation
+      args:
+        num_few_shot: 10
+    metrics:
+    - type: acc_norm
+      value: 81.91
+      name: normalized accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=VAGOsolutions/SauerkrautLM-Gemma-7b
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: MMLU (5-Shot)
+      type: cais/mmlu
+      config: all
+      split: test
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 63.76
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=VAGOsolutions/SauerkrautLM-Gemma-7b
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: TruthfulQA (0-shot)
+      type: truthful_qa
+      config: multiple_choice
+      split: validation
+      args:
+        num_few_shot: 0
+    metrics:
+    - type: mc2
+      value: 61.0
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=VAGOsolutions/SauerkrautLM-Gemma-7b
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: Winogrande (5-shot)
+      type: winogrande
+      config: winogrande_xl
+      split: validation
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 76.64
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=VAGOsolutions/SauerkrautLM-Gemma-7b
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: GSM8k (5-shot)
+      type: gsm8k
+      config: main
+      split: test
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 63.68
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=VAGOsolutions/SauerkrautLM-Gemma-7b
+      name: Open LLM Leaderboard
 ---
 **Update**
 - 01.03.2024 - Reuploaded the model in bfloat16 dtype.
@@ -207,4 +310,17 @@ If you are interested in customized LLMs for business applications, please get i
 We are also keenly seeking support and investment for our startups, VAGO solutions and Hyperspace where we continuously advance the development of robust language models designed to address a diverse range of purposes and requirements. If the prospect of collaboratively navigating future challenges excites you, we warmly invite you to reach out to us at [VAGO solutions](https://vago-solutions.de/#Kontakt), [Hyperspace.computer](https://hyperspace.computer/)
 
 ## Acknowledgement
-Many thanks to [google](https://huggingface.co/google) for providing such valuable model to the Open-Source community
+Many thanks to [google](https://huggingface.co/google) for providing such valuable model to the Open-Source community
+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_VAGOsolutions__SauerkrautLM-Gemma-7b)
+
+| Metric                           |Value|
+|---------------------------------|----:|
+|Avg.                             |67.83|
+|AI2 Reasoning Challenge (25-Shot)|59.98|
+|HellaSwag (10-Shot)              |81.91|
+|MMLU (5-Shot)                    |63.76|
+|TruthfulQA (0-shot)              |61.00|
+|Winogrande (5-shot)              |76.64|
+|GSM8k (5-shot)                   |63.68|
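The `model-index` block added by this PR is machine-readable model-card metadata, so individual scores can be pulled out programmatically once the README front matter is extracted. The sketch below uses PyYAML, which is an assumption on our part (the PR does not prescribe a parser; any YAML parser works), applied to an abridged copy of the metadata above:

```python
import yaml

# Abridged copy of the model-index metadata added by this PR
# (only the first task entry is reproduced here)
front_matter = """
model-index:
- name: SauerkrautLM-Gemma-7b
  results:
  - task:
      type: text-generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
    metrics:
    - type: acc_norm
      value: 59.98
      name: normalized accuracy
"""

meta = yaml.safe_load(front_matter)

# Walk the model-index structure and print each dataset's metric
for entry in meta["model-index"]:
    for result in entry["results"]:
        dataset = result["dataset"]["name"]
        for metric in result["metrics"]:
            print(f'{dataset}: {metric["type"]} = {metric["value"]}')
# AI2 Reasoning Challenge (25-Shot): acc_norm = 59.98
```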
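As a sanity check, the Avg. row in the results table can be reproduced from the six benchmark scores beneath it (a minimal sketch; the values are taken verbatim from the table above):

```python
# Open LLM Leaderboard scores from the table above
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 59.98,
    "HellaSwag (10-Shot)": 81.91,
    "MMLU (5-Shot)": 63.76,
    "TruthfulQA (0-shot)": 61.00,
    "Winogrande (5-shot)": 76.64,
    "GSM8k (5-shot)": 63.68,
}

# The leaderboard average is the unweighted mean of the six benchmarks
avg = round(sum(scores.values()) / len(scores), 2)
print(avg)  # 67.83
```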