Update README.md
README.md CHANGED

@@ -20,7 +20,26 @@ tags:
 - llama-3
 ---
 
+## Model Information
+The Llama 3.1 instruction-tuned, text-only 70B model is optimized for multilingual dialogue use cases
+and outperforms many of the available open-source and closed chat models on common industry benchmarks.
 
+This repository stores an experimental IQ_1S-quantized GGUF build of the Llama 3.1 instruction-tuned 70B model.
+
+**Model developer**: Meta
+
+**Model Architecture**: Llama 3.1 is an auto-regressive language model
+that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT)
+and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness
+and safety.
+
+| |Training Data |Params|Input modalities |Output modalities |Context length|GQA|Token count|Knowledge cutoff|
+|---------------------|--------------------------------------------|------|-----------------|--------------------------|--------------|---|-----------|----------------|
+|Llama 3.1 (text only)|A new mix of publicly available online data.|70B |Multilingual Text|Multilingual Text and code|128k |Yes|15T+ |December 2023 |
+
+**Supported languages**: English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.
+
+## Quantization Information
 |Weight Quantization| PPL |
 |-------------------|--------------------|
 | FP16 | 4.1892 +/- 0.01430 |
@@ -28,4 +47,4 @@ tags:
 
 Dataset used for re-calibration: Mix of [standard_cal_data](https://github.com/turboderp/exllamav2/tree/master/exllamav2/conversion/standard_cal_data)
 
-The generated `imatrix` can be downloaded from [imatrix.dat]()
+The generated `imatrix` can be downloaded from [imatrix.dat](https://huggingface.co/npc0/Meta-Llama-3.1-70B-Instruct-IQ_1S/resolve/main/imatrix.dat)
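For context, reproducing a quantization like this with the published imatrix would look roughly like the sketch below. It is a dry run that only prints the commands: the actual quantization needs the full FP16 GGUF weights on disk, and both the local filenames and the `llama-quantize` binary path (from a local llama.cpp build) are assumptions, not something this repo ships.

```shell
# Dry-run sketch, not run for real: assembles and prints the commands one
# would use. Assumes a local llama.cpp build providing llama-quantize and
# an already-downloaded FP16 GGUF (filename is illustrative).
IMATRIX_URL="https://huggingface.co/npc0/Meta-Llama-3.1-70B-Instruct-IQ_1S/resolve/main/imatrix.dat"
SRC="Meta-Llama-3.1-70B-Instruct-F16.gguf"    # assumed local FP16 GGUF
DST="Meta-Llama-3.1-70B-Instruct-IQ1_S.gguf"  # output file

# Fetch the calibration imatrix published in this repo
DOWNLOAD_CMD="curl -LO $IMATRIX_URL"

# Feed the imatrix to the quantize tool when producing the IQ1_S weights
QUANT_CMD="./llama-quantize --imatrix imatrix.dat $SRC $DST IQ1_S"

echo "$DOWNLOAD_CMD"
echo "$QUANT_CMD"
```

Printing rather than executing keeps the sketch safe to run anywhere; drop the `echo` indirection to perform the actual download and quantization.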