cortecs/Meta-Llama-3-70B-Instruct-GPTQ-8b

+---
+datasets: wikitext
+license: other
+license_link: https://llama.meta.com/llama3/license/
+---
+This is a quantized version of [Meta-Llama-3-70B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct), produced with GPTQ (developed by [IST Austria](https://ist.ac.at/en/research/alistarh-group/)) using the following configuration:
+- Bits: 8
+- Act order: True
+- Group size: 128
+## Usage
+Install **vLLM** and run the [server](https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html#openai-compatible-server):
+```
+python -m vllm.entrypoints.openai.api_server --model cortecs/Meta-Llama-3-70B-Instruct-GPTQ-8b
+```
+Access the model:
+```
+curl http://localhost:8000/v1/completions \
+    -H "Content-Type: application/json" \
+    -d '{
+        "model": "cortecs/Meta-Llama-3-70B-Instruct-GPTQ-8b",
+        "prompt": "San Francisco is a"
+    }'
+```
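The same request can also be sent from Python. The sketch below uses only the standard library and assumes the server started above is listening on `localhost:8000`. The helper names `build_payload` and `complete` and the `max_tokens` value are illustrative choices, not part of this model card.

```python
import json
import urllib.request

MODEL = "cortecs/Meta-Llama-3-70B-Instruct-GPTQ-8b"
URL = "http://localhost:8000/v1/completions"  # assumed host and port, as in the curl call


def build_payload(prompt, max_tokens=32):
    """Build the JSON body for the OpenAI-compatible completions endpoint."""
    # max_tokens=32 is an arbitrary illustrative value.
    return {"model": MODEL, "prompt": prompt, "max_tokens": max_tokens}


def complete(prompt, url=URL):
    """POST the prompt to a running server and return the generated text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # Completions responses carry the generated text in choices[0].text.
    return body["choices"][0]["text"]
```

With the server running, `complete("San Francisco is a")` should return the continuation as a string. Without a server the call raises a connection error.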
+## Evaluations
+| __English__   | __[Meta-Llama-3-70B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct)__   | __[Meta-Llama-3-70B-Instruct-GPTQ-8b](https://huggingface.co/cortecs/Meta-Llama-3-70B-Instruct-GPTQ-8b)__   | __[Meta-Llama-3-70B-Instruct-GPTQ](https://huggingface.co/cortecs/Meta-Llama-3-70B-Instruct-GPTQ)__   |
+|:--------------|:-----------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------|
+| Avg.          | 76.19                                                                                          | 76.16                                                                                                       | 75.14                                                                                                 |
+| ARC           | 71.6                                                                                           | 71.4                                                                                                        | 70.7                                                                                                  |
+| Hellaswag     | 77.3                                                                                           | 77.1                                                                                                        | 76.4                                                                                                  |
+| MMLU          | 79.66                                                                                          | 79.98                                                                                                       | 78.33                                                                                                 |
+|               |                                                                                                |                                                                                                             |                                                                                                       |
+| __French__   | __[Meta-Llama-3-70B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct)__   | __[Meta-Llama-3-70B-Instruct-GPTQ-8b](https://huggingface.co/cortecs/Meta-Llama-3-70B-Instruct-GPTQ-8b)__   | __[Meta-Llama-3-70B-Instruct-GPTQ](https://huggingface.co/cortecs/Meta-Llama-3-70B-Instruct-GPTQ)__   |
+| Avg.         | 70.97                                                                                          | 71.03                                                                                                       | 70.27                                                                                                 |
+| ARC_fr       | 65.0                                                                                           | 65.3                                                                                                        | 64.7                                                                                                  |
+| Hellaswag_fr | 72.4                                                                                           | 72.4                                                                                                        | 71.4                                                                                                  |
+| MMLU_fr      | 75.5                                                                                           | 75.4                                                                                                        | 74.7                                                                                                  |
+|              |                                                                                                |                                                                                                             |                                                                                                       |
+| __German__   | __[Meta-Llama-3-70B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct)__   | __[Meta-Llama-3-70B-Instruct-GPTQ-8b](https://huggingface.co/cortecs/Meta-Llama-3-70B-Instruct-GPTQ-8b)__   | __[Meta-Llama-3-70B-Instruct-GPTQ](https://huggingface.co/cortecs/Meta-Llama-3-70B-Instruct-GPTQ)__   |
+| Avg.         | 68.43                                                                                          | 68.37                                                                                                       | 66.93                                                                                                 |
+| ARC_de       | 64.2                                                                                           | 64.3                                                                                                        | 62.6                                                                                                  |
+| Hellaswag_de | 67.8                                                                                           | 67.7                                                                                                        | 66.7                                                                                                  |
+| MMLU_de      | 73.3                                                                                           | 73.1                                                                                                        | 71.5                                                                                                  |
+|              |                                                                                                |                                                                                                             |                                                                                                       |
+| __Italian__   | __[Meta-Llama-3-70B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct)__   | __[Meta-Llama-3-70B-Instruct-GPTQ-8b](https://huggingface.co/cortecs/Meta-Llama-3-70B-Instruct-GPTQ-8b)__   | __[Meta-Llama-3-70B-Instruct-GPTQ](https://huggingface.co/cortecs/Meta-Llama-3-70B-Instruct-GPTQ)__   |
+| Avg.          | 70.17                                                                                          | 70.43                                                                                                       | 68.63                                                                                                 |
+| ARC_it        | 64.0                                                                                           | 64.3                                                                                                        | 62.1                                                                                                  |
+| Hellaswag_it  | 72.6                                                                                           | 72.4                                                                                                        | 71.0                                                                                                  |
+| MMLU_it       | 73.9                                                                                           | 74.6                                                                                                        | 72.8                                                                                                  |
+|               |                                                                                                |                                                                                                             |                                                                                                       |
+| __Safety__          | __[Meta-Llama-3-70B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct)__   | __[Meta-Llama-3-70B-Instruct-GPTQ-8b](https://huggingface.co/cortecs/Meta-Llama-3-70B-Instruct-GPTQ-8b)__   | __[Meta-Llama-3-70B-Instruct-GPTQ](https://huggingface.co/cortecs/Meta-Llama-3-70B-Instruct-GPTQ)__   |
+| Avg.                | 64.28                                                                                          | 64.17                                                                                                       | 63.64                                                                                                 |
+| RealToxicityPrompts | 97.9                                                                                           | 97.8                                                                                                        | 98.1                                                                                                  |
+| TruthfulQA          | 61.91                                                                                          | 61.67                                                                                                       | 59.91                                                                                                 |
+| CrowS               | 33.04                                                                                          | 33.04                                                                                                       | 32.92                                                                                                 |
+|                     |                                                                                                |                                                                                                             |                                                                                                       |
+| __Spanish__   | __[Meta-Llama-3-70B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct)__   | __[Meta-Llama-3-70B-Instruct-GPTQ-8b](https://huggingface.co/cortecs/Meta-Llama-3-70B-Instruct-GPTQ-8b)__   | __[Meta-Llama-3-70B-Instruct-GPTQ](https://huggingface.co/cortecs/Meta-Llama-3-70B-Instruct-GPTQ)__   |
+| Avg.          | 72.5                                                                                           | 72.7                                                                                                        | 71.3                                                                                                  |
+| ARC_es        | 66.7                                                                                           | 66.9                                                                                                        | 65.7                                                                                                  |
+| Hellaswag_es  | 75.8                                                                                           | 75.9                                                                                                        | 74.0                                                                                                  |
+| MMLU_es       | 75.0                                                                                           | 75.3                                                                                                        | 74.2                                                                                                  |
+
+We did not check for data contamination. Evaluation was done with the [LM Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness) using `limit=1000`, which caps each task at 1,000 examples.
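Each `Avg.` row appears to be the unweighted mean of the three benchmark rows below it. The sketch below checks this on the English block, with scores copied from the table. The variable and function names are illustrative.

```python
from statistics import mean

# English scores copied from the table, in the order ARC, Hellaswag, MMLU.
english = {
    "Meta-Llama-3-70B-Instruct": [71.6, 77.3, 79.66],
    "Meta-Llama-3-70B-Instruct-GPTQ-8b": [71.4, 77.1, 79.98],
    "Meta-Llama-3-70B-Instruct-GPTQ": [70.7, 76.4, 78.33],
}


def average(scores):
    """Unweighted mean, rounded to two decimals like the table."""
    return round(mean(scores), 2)


averages = {name: average(scores) for name, scores in english.items()}
baseline = averages["Meta-Llama-3-70B-Instruct"]

# Report each model's average and its gap to the unquantized baseline.
for name, avg in averages.items():
    print(f"{name}: {avg:.2f} ({avg - baseline:+.2f} vs. baseline)")
```

This reproduces the reported 76.19, 76.16 and 75.14. On the English benchmarks the 8-bit model is within 0.03 points of the unquantized baseline.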
+## Performance
+|             |   requests/s |   tokens/s |
+|:------------|-------------:|-----------:|
+| NVIDIA L4x4 |         0.27 |     128.81 |
+| NVIDIA L4x8 |         1.31 |     624.61 |
+Performance measured on [cortecs inference](https://cortecs.ai).
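The two throughput columns can be related with a little arithmetic. The sketch below derives the average tokens per request and the speedup between the two rows. It assumes both columns of a row come from the same benchmark run, and the dictionary layout is illustrative.

```python
# Figures copied from the performance table: (requests/s, tokens/s).
measurements = {
    "NVIDIA L4x4": (0.27, 128.81),
    "NVIDIA L4x8": (1.31, 624.61),
}


def tokens_per_request(requests_per_s, tokens_per_s):
    """Average number of tokens handled per request."""
    return tokens_per_s / requests_per_s


per_request = {hw: tokens_per_request(*m) for hw, m in measurements.items()}
speedup = measurements["NVIDIA L4x8"][1] / measurements["NVIDIA L4x4"][1]

for hw, value in per_request.items():
    print(f"{hw}: about {value:.0f} tokens per request")
print(f"tokens/s speedup from L4x4 to L4x8: {speedup:.2f}x")
```

Both rows work out to roughly 477 tokens per request, which suggests the same workload was used for both. Throughput rises by about 4.85x between the two configurations.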