leaderboard-pr-bot committed on
Commit fbc911e • 1 Parent(s): 68d6f7b

Adding Evaluation Results

This is an automated PR created with https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr

The purpose of this PR is to add evaluation results from the Open LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr/discussions

Files changed (1)
  1. README.md +121 -5
README.md CHANGED
@@ -1,13 +1,116 @@
  ---
  license: apache-2.0
  datasets:
  - Open-Orca/SlimOrca
- language:
- - en
  pipeline_tag: text-generation
  inference: false
- tags:
- - text-generation-inference
  ---

  # 🌟 Falcon-RW-1B-Instruct-OpenOrca
@@ -76,4 +179,17 @@ This model may generate inaccurate or misleading information and is prone to hal
  The model is provided 'as is' without any warranties, and the creators are not liable for any damages arising from its use. Users are responsible for their interactions with the model.

  ## 📬 Contact
- For further inquiries or feedback, please contact at eric.fu96@aol.com.
  ---
+ language:
+ - en
  license: apache-2.0
+ tags:
+ - text-generation-inference
  datasets:
  - Open-Orca/SlimOrca
  pipeline_tag: text-generation
  inference: false
+ model-index:
+ - name: falcon-rw-1b-instruct-openorca
+   results:
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: AI2 Reasoning Challenge (25-Shot)
+       type: ai2_arc
+       config: ARC-Challenge
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: acc_norm
+       value: 34.56
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ericzzz/falcon-rw-1b-instruct-openorca
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: HellaSwag (10-Shot)
+       type: hellaswag
+       split: validation
+       args:
+         num_few_shot: 10
+     metrics:
+     - type: acc_norm
+       value: 60.93
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ericzzz/falcon-rw-1b-instruct-openorca
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MMLU (5-Shot)
+       type: cais/mmlu
+       config: all
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 28.77
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ericzzz/falcon-rw-1b-instruct-openorca
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: TruthfulQA (0-shot)
+       type: truthful_qa
+       config: multiple_choice
+       split: validation
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: mc2
+       value: 37.42
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ericzzz/falcon-rw-1b-instruct-openorca
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: Winogrande (5-shot)
+       type: winogrande
+       config: winogrande_xl
+       split: validation
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 60.69
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ericzzz/falcon-rw-1b-instruct-openorca
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: GSM8k (5-shot)
+       type: gsm8k
+       config: main
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 3.41
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ericzzz/falcon-rw-1b-instruct-openorca
+       name: Open LLM Leaderboard
  ---

  # 🌟 Falcon-RW-1B-Instruct-OpenOrca
 
  The model is provided 'as is' without any warranties, and the creators are not liable for any damages arising from its use. Users are responsible for their interactions with the model.

  ## 📬 Contact
+ For further inquiries or feedback, please contact at eric.fu96@aol.com.
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_ericzzz__falcon-rw-1b-instruct-openorca)
+
+ |             Metric              |Value|
+ |---------------------------------|----:|
+ |Avg.                             |37.63|
+ |AI2 Reasoning Challenge (25-Shot)|34.56|
+ |HellaSwag (10-Shot)              |60.93|
+ |MMLU (5-Shot)                    |28.77|
+ |TruthfulQA (0-shot)              |37.42|
+ |Winogrande (5-shot)              |60.69|
+ |GSM8k (5-shot)                   | 3.41|
+
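
The `Avg.` row in the table above can be checked against the six benchmark values. The following is a minimal sketch, assuming the average is an unweighted mean rounded to two decimals; the helper name `leaderboard_average` is hypothetical and is not part of the PR or of any leaderboard tooling.

```python
# Benchmark values copied from the results table added by this PR.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 34.56,
    "HellaSwag (10-Shot)": 60.93,
    "MMLU (5-Shot)": 28.77,
    "TruthfulQA (0-shot)": 37.42,
    "Winogrande (5-shot)": 60.69,
    "GSM8k (5-shot)": 3.41,
}


def leaderboard_average(values):
    """Unweighted mean rounded to two decimals (assumed convention)."""
    values = list(values)
    return round(sum(values) / len(values), 2)


average = leaderboard_average(scores.values())
print(f"Avg. = {average:.2f}")  # prints "Avg. = 37.63"
```

Under that assumption the recomputed value matches the 37.63 reported in the table.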