Update README.md
README.md CHANGED

@@ -20,7 +20,26 @@ tags:
 - llama-3
 ---
 
+## Model Information
+The Llama 3.1 instruction-tuned, text-only 70B model is optimized for multilingual dialogue use cases
+and outperforms many of the available open-source and closed chat models on common industry benchmarks.
 
+This repository stores an experimental IQ_1S-quantized GGUF build of the Llama 3.1 instruction-tuned 70B model.
+
+**Model developer**: Meta
+
+**Model Architecture**: Llama 3.1 is an auto-regressive language model
+that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT)
+and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness
+and safety.
+
+| |Training Data |Params|Input modalities |Output modalities |Context length|GQA|Token count|Knowledge cutoff|
+|---------------------|--------------------------------------------|------|-----------------|--------------------------|--------------|---|-----------|----------------|
+|Llama 3.1 (text only)|A new mix of publicly available online data.|70B |Multilingual Text|Multilingual Text and code|128k |Yes|15T+ |December 2023 |
+
+**Supported languages**: English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.
+
+## Quantization Information
 |Weight Quantization| PPL |
 |-------------------|--------------------|
 | FP16 | 4.1892 +/- 0.01430 |
@@ -28,4 +47,4 @@ tags:
 
 Dataset used for re-calibration: Mix of [standard_cal_data](https://github.com/turboderp/exllamav2/tree/master/exllamav2/conversion/standard_cal_data)
 
-The generated `imatrix` can be downloaded from [imatrix.dat]()
+The generated `imatrix` can be downloaded from [imatrix.dat](https://huggingface.co/npc0/Meta-Llama-3.1-70B-Instruct-IQ_1S/resolve/main/imatrix.dat)
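For context, reproducing a quantization like this with the published imatrix would look roughly like the sketch below. It is a dry run that only prints the commands: the actual quantization needs the full FP16 GGUF weights on disk, and both the local filenames and the `llama-quantize` binary path (from a local llama.cpp build) are assumptions, not something this repo ships.

```shell
# Dry-run sketch, not run for real: assembles and prints the commands one
# would use. Assumes a local llama.cpp build providing llama-quantize and
# an already-downloaded FP16 GGUF (filename is illustrative).
IMATRIX_URL="https://huggingface.co/npc0/Meta-Llama-3.1-70B-Instruct-IQ_1S/resolve/main/imatrix.dat"
SRC="Meta-Llama-3.1-70B-Instruct-F16.gguf"    # assumed local FP16 GGUF
DST="Meta-Llama-3.1-70B-Instruct-IQ1_S.gguf"  # output file

# Fetch the calibration imatrix published in this repo
DOWNLOAD_CMD="curl -LO $IMATRIX_URL"

# Feed the imatrix to the quantize tool when producing the IQ1_S weights
QUANT_CMD="./llama-quantize --imatrix imatrix.dat $SRC $DST IQ1_S"

echo "$DOWNLOAD_CMD"
echo "$QUANT_CMD"
```

Printing rather than executing keeps the sketch safe to run anywhere; drop the `echo` indirection to perform the actual download and quantization.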