argilla/distilabeled-Marcoro14-7B-slerp

plaguss (HF staff) committed on Jan 11

Commit f8ef65e • 1 parent: d488715

Add nous benchmark

Files changed (1)

README.md +14 -0


@@ -18,3 +18,17 @@ tags:
     <img src="https://raw.githubusercontent.com/argilla-io/distilabel/main/docs/assets/distilabel-badge-light.png" alt="Built with Distilabel" width="200" height="32"/>
   </a>
 </p>

+## Benchmark results
+We benchmarked the model with the well-known "Nous" (or "Teknium") benchmark suite. The table below gives an overview, including our first experiment, which used a less ambitious dataset filtering (removing ties and `score>5`).
+To run the benchmark we used [LLM AutoEval](https://github.com/mlabonne/llm-autoeval), another awesome contribution from Maxime. Check it out!
+|Model|AGIEval|GPT4ALL|TruthfulQA|Bigbench|Average|
+|-----|------:|------:|---------:|-------:|------:|
+|[argilla/distilabeled-Marcoro14-7B-slerp](https://huggingface.co/argilla/distilabeled-Marcoro14-7B-slerp)|**45.4**|**76.47**|**65.46**|**47.19**|**58.63**|
+|[mlabonne/Marcoro14-7B-slerp](https://huggingface.co/mlabonne/Marcoro14-7B-slerp)|44.66|76.24|64.15|45.64|57.67|
+|[argilla/distilabeled-Hermes-2.5-Mistral-7B](https://huggingface.co/argilla/distilabeled-Hermes-2.5-Mistral-7B)|44.64|73.35|55.96|42.21|54.04|
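
The Average column in the table above is consistent with an unweighted mean of the four benchmark scores. That reading is inferred from the numbers, not stated in the README, so the snippet below is only a minimal sketch for checking it. The `scores` dictionary and the `average` helper are illustrative names and are not part of LLM AutoEval.

```python
# Scores copied from the table above, in column order:
# AGIEval, GPT4ALL, TruthfulQA, Bigbench.
scores = {
    "argilla/distilabeled-Marcoro14-7B-slerp": [45.4, 76.47, 65.46, 47.19],
    "mlabonne/Marcoro14-7B-slerp": [44.66, 76.24, 64.15, 45.64],
    "argilla/distilabeled-Hermes-2.5-Mistral-7B": [44.64, 73.35, 55.96, 42.21],
}


def average(values):
    """Unweighted mean rounded to two decimals (assumed to match the table)."""
    return round(sum(values) / len(values), 2)


# Recompute the Average column for every model.
averages = {model: average(values) for model, values in scores.items()}

for model, avg in averages.items():
    print(f"{model}: {avg}")
```

Under that assumption the recomputed values can be compared directly with the Average column, which is a quick way to catch a transcription error when new rows are added.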