cortecs/Meta-Llama-3-70B-Instruct-GPTQ

@@ -25,42 +25,42 @@ curl http://localhost:8000/v1/completions     -H "Content-Type: application/json
 ```
 ## Evaluations
-| __English__   | __Llama-3 70B Instruct__   | __Llama 3 70B GPTQ__   | __Mixtral Instruct__   |
-|:--------------|:---------------------------|:-----------------------|:-----------------------|
-| Avg.          | 76.19                      | 75.14                  | 73.17                  |
-| ARC           | 71.6                       | 70.7                   | 71.0                   |
-| Hellaswag     | 77.3                       | 76.4                   | 77.0                   |
-| MMLU          | 79.66                      | 78.33                  | 71.52                  |
-|               |                            |                        |                        |
-| __French__   | __Llama-3 70B Instruct__   | __Llama 3 70B GPTQ__   | __Mixtral Instruct__   |
-| Avg.         | 70.97                      | 70.27                  | 68.7                   |
-| ARC_fr       | 65.0                       | 64.7                   | 63.9                   |
-| Hellaswag_fr | 72.4                       | 71.4                   | 77.1                   |
-| MMLU_fr      | 75.5                       | 74.7                   | 65.1                   |
-|              |                            |                        |                        |
-| __German__   | __Llama-3 70B Instruct__   | __Llama 3 70B GPTQ__   | __Mixtral Instruct__   |
-| Avg.         | 68.43                      | 66.93                  | 66.47                  |
-| ARC_de       | 64.2                       | 62.6                   | 62.8                   |
-| Hellaswag_de | 67.8                       | 66.7                   | 72.1                   |
-| MMLU_de      | 73.3                       | 71.5                   | 64.5                   |
-|              |                            |                        |                        |
-| __Italian__   | __Llama-3 70B Instruct__   | __Llama 3 70B GPTQ__   | __Mixtral Instruct__   |
-| Avg.          | 70.17                      | 68.63                  | 67.17                  |
-| ARC_it        | 64.0                       | 62.1                   | 63.8                   |
-| Hellaswag_it  | 72.6                       | 71.0                   | 75.6                   |
-| MMLU_it       | 73.9                       | 72.8                   | 62.1                   |
-|               |                            |                        |                        |
-| __Safety__          | __Llama-3 70B Instruct__   | __Llama 3 70B GPTQ__   | __Mixtral Instruct__   |
-| Avg.                | 64.28                      | 63.64                  | 63.56                  |
-| RealToxicityPrompts | 97.9                       | 98.1                   | 93.2                   |
-| TruthfulQA          | 61.91                      | 59.91                  | 64.61                  |
-| CrowS               | 33.04                      | 32.92                  | 32.86                  |
-|                     |                            |                        |                        |
-| __Spanish__   |   __Llama-3 70B Instruct__ |   __Llama 3 70B GPTQ__ |   __Mixtral Instruct__ |
-| Avg.          |                       72.5 |                   71.3 |                   68.8 |
-| ARC_es        |                       66.7 |                   65.7 |                   64.4 |
-| Hellaswag_es  |                       75.8 |                   74   |                   77.5 |
-| MMLU_es       |                       75   |                   74.2 |                   64.6 |
 Take with caution. We did not check for data contamination.
      Evaluation was done using [Eval. Harness](https://github.com/EleutherAI/lm-evaluation-harness) using `limit=1000` for big datasets.

 ```
 ## Evaluations
+| __English__   | __Llama-3 70B Instruct__   | __Llama 3 70B Instruct GPTQ__   | __Mixtral Instruct__   |
+|:--------------|:---------------------------|:--------------------------------|:-----------------------|
+| Avg.          | 76.19                      | 75.14                           | 73.17                  |
+| ARC           | 71.6                       | 70.7                            | 71.0                   |
+| Hellaswag     | 77.3                       | 76.4                            | 77.0                   |
+| MMLU          | 79.66                      | 78.33                           | 71.52                  |
+|               |                            |                                 |                        |
+| __French__   | __Llama-3 70B Instruct__   | __Llama 3 70B Instruct GPTQ__   | __Mixtral Instruct__   |
+| Avg.         | 70.97                      | 70.27                           | 68.7                   |
+| ARC_fr       | 65.0                       | 64.7                            | 63.9                   |
+| Hellaswag_fr | 72.4                       | 71.4                            | 77.1                   |
+| MMLU_fr      | 75.5                       | 74.7                            | 65.1                   |
+|              |                            |                                 |                        |
+| __German__   | __Llama-3 70B Instruct__   | __Llama 3 70B Instruct GPTQ__   | __Mixtral Instruct__   |
+| Avg.         | 68.43                      | 66.93                           | 66.47                  |
+| ARC_de       | 64.2                       | 62.6                            | 62.8                   |
+| Hellaswag_de | 67.8                       | 66.7                            | 72.1                   |
+| MMLU_de      | 73.3                       | 71.5                            | 64.5                   |
+|              |                            |                                 |                        |
+| __Italian__   | __Llama-3 70B Instruct__   | __Llama 3 70B Instruct GPTQ__   | __Mixtral Instruct__   |
+| Avg.          | 70.17                      | 68.63                           | 67.17                  |
+| ARC_it        | 64.0                       | 62.1                            | 63.8                   |
+| Hellaswag_it  | 72.6                       | 71.0                            | 75.6                   |
+| MMLU_it       | 73.9                       | 72.8                            | 62.1                   |
+|               |                            |                                 |                        |
+| __Safety__          | __Llama-3 70B Instruct__   | __Llama 3 70B Instruct GPTQ__   | __Mixtral Instruct__   |
+| Avg.                | 64.28                      | 63.64                           | 63.56                  |
+| RealToxicityPrompts | 97.9                       | 98.1                            | 93.2                   |
+| TruthfulQA          | 61.91                      | 59.91                           | 64.61                  |
+| CrowS               | 33.04                      | 32.92                           | 32.86                  |
+|                     |                            |                                 |                        |
+| __Spanish__   |   __Llama-3 70B Instruct__ |   __Llama 3 70B Instruct GPTQ__ |   __Mixtral Instruct__ |
+| Avg.          |                       72.5 |                            71.3 |                   68.8 |
+| ARC_es        |                       66.7 |                            65.7 |                   64.4 |
+| Hellaswag_es  |                       75.8 |                            74   |                   77.5 |
+| MMLU_es       |                       75   |                            74.2 |                   64.6 |
 Take with caution. We did not check for data contamination.
      Evaluation was done using [Eval. Harness](https://github.com/EleutherAI/lm-evaluation-harness) using `limit=1000` for big datasets.
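
The `Avg.` rows in the tables above are consistent with a plain arithmetic mean of the three task scores in each block, rounded to two decimals. The following is a minimal sketch that reproduces the English averages and the gap between the full-precision and GPTQ models. The scores are copied from the table; the variable and function names are illustrative and not part of the model card or of Eval. Harness.

```python
from statistics import mean

# English scores copied from the table above (ARC, Hellaswag, MMLU).
# The dictionary layout is an illustrative assumption, not Eval. Harness output.
english_scores = {
    "Llama-3 70B Instruct": [71.6, 77.3, 79.66],
    "Llama 3 70B Instruct GPTQ": [70.7, 76.4, 78.33],
    "Mixtral Instruct": [71.0, 77.0, 71.52],
}


def block_average(scores):
    """Arithmetic mean of the task scores, rounded as in the table."""
    return round(mean(scores), 2)


averages = {model: block_average(s) for model, s in english_scores.items()}

# Difference in the unrounded means between the original and GPTQ models.
quantization_gap = round(
    mean(english_scores["Llama-3 70B Instruct"])
    - mean(english_scores["Llama 3 70B Instruct GPTQ"]),
    2,
)

for model, avg in averages.items():
    print(f"{model}: {avg}")
print(f"Gap (original minus GPTQ): {quantization_gap}")
```

The same calculation applied to the other language blocks and to the Safety block matches their reported `Avg.` rows as well.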