---

license: other
license_link: https://llama.meta.com/llama3/license/
base_model: meta-llama/Llama-3.3-70B-Instruct

---
This is an FP8 dynamic quantization of [Llama-3.3-70B-Instruct](https://huggingface.co/meta-llama--Llama-3.3-70B-Instruct).

Meta Llama 3.3 is a multilingual large language model (LLM) with 70 billion parameters, pretrained and instruction-tuned for generative text tasks. It is optimized for multilingual dialogue and supports English plus seven additional languages: French, German, Hindi, Italian, Portuguese, Spanish, and Thai. According to Meta, the model outperforms many available open-source and proprietary chat models on common industry benchmarks, which makes it a strong base for building multilingual AI applications.
## Evaluations
This quantized model recovers 99.67% of the original model's accuracy.
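
The recovery figure can be roughly cross-checked against the Avg. rows of the table below. The following is a minimal sketch that assumes "accuracy recovery" means the quantized model's summed average score divided by the baseline's; the model card does not state the exact formula, and the helper name `accuracy_recovery` is purely illustrative.

```python
# Avg. scores copied from the evaluation table: (baseline, FP8-Dynamic).
AVG_SCORES = {
    "English": (74.1, 73.75),
    "French": (73.07, 72.87),
    "German": (70.07, 69.83),
    "Italian": (73.67, 73.37),
    "Portuguese": (74.4, 73.87),
    "Spanish": (74.0, 74.13),
}


def accuracy_recovery(scores):
    """Assumed definition: summed quantized score / summed baseline score, in percent."""
    baseline = sum(b for b, _ in scores.values())
    quantized = sum(q for _, q in scores.values())
    return 100.0 * quantized / baseline


if __name__ == "__main__":
    print(f"Accuracy recovery: {accuracy_recovery(AVG_SCORES):.2f}%")
```

Under this assumption the result lands within a few hundredths of a percentage point of the reported 99.67%; the remaining gap is plausibly due to rounding in the tabulated scores.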

| __English__   | __[Llama-3.3-70B-Instruct](https://huggingface.co/meta-llama--Llama-3.3-70B-Instruct)__   | __[Llama-3.3-70B-Instruct-FP8-Dynamic (this)](https://huggingface.co/cortecs--Llama-3.3-70B-Instruct-FP8-Dynamic)__   |
|:--------------|:------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------|
| Avg.          | 74.1                                                                                      | 73.75                                                                                                                 |
| Arc           | 71.7                                                                                      | 71.6                                                                                                                  |
| Hellaswag     | 76.5                                                                                      | 75.9                                                                                                                  |
|               |                                                                                           |                                                                                                                       |
| __French__   | __[Llama-3.3-70B-Instruct](https://huggingface.co/meta-llama--Llama-3.3-70B-Instruct)__   | __[Llama-3.3-70B-Instruct-FP8-Dynamic (this)](https://huggingface.co/cortecs--Llama-3.3-70B-Instruct-FP8-Dynamic)__   |
| Avg.         | 73.07                                                                                     | 72.87                                                                                                                 |
| Arc          | 64.7                                                                                      | 64.5                                                                                                                  |
| Hellaswag    | 76.6                                                                                      | 76.6                                                                                                                  |
| MMLU         | 77.9                                                                                      | 77.5                                                                                                                  |
|              |                                                                                           |                                                                                                                       |
| __German__   | __[Llama-3.3-70B-Instruct](https://huggingface.co/meta-llama--Llama-3.3-70B-Instruct)__   | __[Llama-3.3-70B-Instruct-FP8-Dynamic (this)](https://huggingface.co/cortecs--Llama-3.3-70B-Instruct-FP8-Dynamic)__   |
| Avg.         | 70.07                                                                                     | 69.83                                                                                                                 |
| Arc          | 61.8                                                                                      | 61.2                                                                                                                  |
| Hellaswag    | 71.2                                                                                      | 71.1                                                                                                                  |
| MMLU         | 77.2                                                                                      | 77.2                                                                                                                  |
|              |                                                                                           |                                                                                                                       |
| __Italian__   | __[Llama-3.3-70B-Instruct](https://huggingface.co/meta-llama--Llama-3.3-70B-Instruct)__   | __[Llama-3.3-70B-Instruct-FP8-Dynamic (this)](https://huggingface.co/cortecs--Llama-3.3-70B-Instruct-FP8-Dynamic)__   |
| Avg.          | 73.67                                                                                     | 73.37                                                                                                                 |
| Arc           | 66.5                                                                                      | 65.7                                                                                                                  |
| Hellaswag     | 76.0                                                                                      | 76.2                                                                                                                  |
| MMLU          | 78.5                                                                                      | 78.2                                                                                                                  |
|               |                                                                                           |                                                                                                                       |
| __Portuguese__   | __[Llama-3.3-70B-Instruct](https://huggingface.co/meta-llama--Llama-3.3-70B-Instruct)__   | __[Llama-3.3-70B-Instruct-FP8-Dynamic (this)](https://huggingface.co/cortecs--Llama-3.3-70B-Instruct-FP8-Dynamic)__   |
| Avg.             | 74.4                                                                                      | 73.87                                                                                                                 |
| Arc              | 66.4                                                                                      | 65.5                                                                                                                  |
| Hellaswag        | 77.2                                                                                      | 76.9                                                                                                                  |
| MMLU             | 79.6                                                                                      | 79.2                                                                                                                  |
|                  |                                                                                           |                                                                                                                       |
| __Spanish__   | __[Llama-3.3-70B-Instruct](https://huggingface.co/meta-llama--Llama-3.3-70B-Instruct)__   | __[Llama-3.3-70B-Instruct-FP8-Dynamic (this)](https://huggingface.co/cortecs--Llama-3.3-70B-Instruct-FP8-Dynamic)__   |
| Avg.          | 74                                                                                        | 74.13                                                                                                                 |
| Arc           | 65.8                                                                                      | 65.8                                                                                                                  |
| Hellaswag     | 77.1                                                                                      | 77.2                                                                                                                  |
| MMLU          | 79.1                                                                                      | 79.4                                                                                                                  |

We did not check for data contamination. Evaluation was done using the [LM Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness) with `limit=1000`, i.e. at most 1,000 examples per task.
    
## Usage
Install **vLLM** and run the [server](https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html#openai-compatible-server):
    
```
python -m vllm.entrypoints.openai.api_server --model cortecs/Llama-3.3-70B-Instruct-FP8-Dynamic
```
Access the model:
```
curl http://localhost:8000/v1/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "cortecs/Llama-3.3-70B-Instruct-FP8-Dynamic",
        "prompt": "San Francisco is a"
    }'
```
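
The same request can also be sent from Python. The sketch below uses only the standard library and assumes the server is reachable at the default `http://localhost:8000` shown in the curl example and returns the OpenAI-style completions schema (`choices[0].text`). The helper names `build_completion_request` and `complete`, and the `max_tokens` default, are illustrative choices rather than part of the model card.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"  # assumed default address, as in the curl example
MODEL = "cortecs/Llama-3.3-70B-Instruct-FP8-Dynamic"


def build_completion_request(prompt, max_tokens=32):
    """Build a POST request for the OpenAI-compatible /completions endpoint."""
    payload = {"model": MODEL, "prompt": prompt, "max_tokens": max_tokens}
    return urllib.request.Request(
        f"{BASE_URL}/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, max_tokens=32):
    """Send the request and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_completion_request(prompt, max_tokens)) as resp:
        body = json.load(resp)
    # Assumes the OpenAI completions response layout.
    return body["choices"][0]["text"]


# Usage, once the vLLM server is running:
# print(complete("San Francisco is a"))
```

Splitting request construction from sending keeps the payload easy to inspect or log before any network call is made.
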
⚡ This model is optimized for heavy workloads, providing a total throughput of **1485 tokens per second** on a single NVIDIA H100 ⚡