Adding Evaluation Results

This is an automated PR created with https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr

The purpose of this PR is to add evaluation results from the Open LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr/discussions

Files changed (1) hide show

README.md +121 -5

README.md CHANGED Viewed

@@ -1,13 +1,116 @@
 ---
 license: apache-2.0
 datasets:
 - Open-Orca/SlimOrca
-language:
-- en
 pipeline_tag: text-generation
 inference: false
-tags:
-- text-generation-inference
 ---
 # 🌟 Falcon-RW-1B-Instruct-OpenOrca
@@ -76,4 +179,17 @@ This model may generate inaccurate or misleading information and is prone to hal
 The model is provided 'as is' without any warranties, and the creators are not liable for any damages arising from its use. Users are responsible for their interactions with the model.
 ## 📬 Contact
-For further inquiries or feedback, please contact at eric.fu96@aol.com.

 ---
+language:
+- en
 license: apache-2.0
+tags:
+- text-generation-inference
 datasets:
 - Open-Orca/SlimOrca
 pipeline_tag: text-generation
 inference: false
+model-index:
+- name: falcon-rw-1b-instruct-openorca
+  results:
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: AI2 Reasoning Challenge (25-Shot)
+      type: ai2_arc
+      config: ARC-Challenge
+      split: test
+      args:
+        num_few_shot: 25
+    metrics:
+    - type: acc_norm
+      value: 34.56
+      name: normalized accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ericzzz/falcon-rw-1b-instruct-openorca
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: HellaSwag (10-Shot)
+      type: hellaswag
+      split: validation
+      args:
+        num_few_shot: 10
+    metrics:
+    - type: acc_norm
+      value: 60.93
+      name: normalized accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ericzzz/falcon-rw-1b-instruct-openorca
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: MMLU (5-Shot)
+      type: cais/mmlu
+      config: all
+      split: test
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 28.77
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ericzzz/falcon-rw-1b-instruct-openorca
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: TruthfulQA (0-shot)
+      type: truthful_qa
+      config: multiple_choice
+      split: validation
+      args:
+        num_few_shot: 0
+    metrics:
+    - type: mc2
+      value: 37.42
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ericzzz/falcon-rw-1b-instruct-openorca
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: Winogrande (5-shot)
+      type: winogrande
+      config: winogrande_xl
+      split: validation
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 60.69
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ericzzz/falcon-rw-1b-instruct-openorca
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: GSM8k (5-shot)
+      type: gsm8k
+      config: main
+      split: test
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 3.41
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ericzzz/falcon-rw-1b-instruct-openorca
+      name: Open LLM Leaderboard
 ---
 # 🌟 Falcon-RW-1B-Instruct-OpenOrca
 The model is provided 'as is' without any warranties, and the creators are not liable for any damages arising from its use. Users are responsible for their interactions with the model.
 ## 📬 Contact
+For further inquiries or feedback, please contact at eric.fu96@aol.com.
+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_ericzzz__falcon-rw-1b-instruct-openorca)
+|             Metric              |Value|
+|---------------------------------|----:|
+|Avg.                             |37.63|
+|AI2 Reasoning Challenge (25-Shot)|34.56|
+|HellaSwag (10-Shot)              |60.93|
+|MMLU (5-Shot)                    |28.77|
+|TruthfulQA (0-shot)              |37.42|
+|Winogrande (5-shot)              |60.69|
+|GSM8k (5-shot)                   | 3.41|