Adding Evaluation Results #2
by leaderboard-pr-bot - opened

README.md CHANGED
````diff
@@ -155,3 +155,17 @@ ASSISTANT:Silent dove at night,
 Softly cooing in the dark,
 Peaceful melody.
 ```
+
+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_openaccess-ai-collective__minotaur-13b)
+
+| Metric                | Value |
+|-----------------------|-------|
+| Avg.                  | 51.31 |
+| ARC (25-shot)         | 56.4  |
+| HellaSwag (10-shot)   | 79.13 |
+| MMLU (5-shot)         | 49.61 |
+| TruthfulQA (0-shot)   | 49.62 |
+| Winogrande (5-shot)   | 76.56 |
+| GSM8K (5-shot)        | 12.51 |
+| DROP (3-shot)         | 35.33 |
````
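
For reference, the Avg. row appears to be the plain arithmetic mean of the seven benchmark scores above; the diff itself does not state the formula, so this is an assumption, but the numbers are consistent with it:

$$
\text{Avg.} = \frac{56.4 + 79.13 + 49.61 + 49.62 + 76.56 + 12.51 + 35.33}{7} \approx 51.31
$$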