markoarnauto
commited on
Commit
•
544606b
1
Parent(s):
2314c91
Upload README.md with huggingface_hub
Browse files
README.md
CHANGED
@@ -1,10 +1,15 @@
|
|
|
|
|
|
|
|
|
|
|
|
1 |
This is a quantized model of [Llama-3 70B Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct) using GPTQ developed by [IST Austria](https://ist.ac.at/en/research/alistarh-group/)
|
2 |
using the following configuration:
|
3 |
- 4bit (8bit will follow)
|
4 |
- Act order: True
|
5 |
- Group size: 128
|
6 |
- Seq. length: 4096
|
7 |
-
|
8 |
## Usage
|
9 |
Install **vLLM** and
|
10 |
run the [server](https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html#openai-compatible-server):
|
|
|
1 |
+
---
|
2 |
+
datasets: wikitext
|
3 |
+
license: apache-2.0
|
4 |
+
license_link: https://llama.meta.com/llama3/license/
|
5 |
+
---
|
6 |
This is a quantized model of [Llama-3 70B Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct) using GPTQ developed by [IST Austria](https://ist.ac.at/en/research/alistarh-group/)
|
7 |
using the following configuration:
|
8 |
- 4bit (8bit will follow)
|
9 |
- Act order: True
|
10 |
- Group size: 128
|
11 |
- Seq. length: 4096
|
12 |
+
|
13 |
## Usage
|
14 |
Install **vLLM** and
|
15 |
run the [server](https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html#openai-compatible-server):
|