Vui Seng Chua committed
Commit 769e2c6
Parent(s): 73d74ae
Revise README.md
README.md
CHANGED
@@ -2,11 +2,13 @@
 
 This repo contains binary of weight quantized by [OpenVINO](https://github.com/openvinotoolkit/openvino_notebooks/blob/main/notebooks/254-llm-chatbot/254-llm-chatbot.ipynb).
 
+```
 | LLM             | ratio | group_size |
 |-----------------|-------|------------|
 | llama-2-chat-7b | 0.8   | 128        |
 | mistral-7b      | 0.6   | 64         |
 | gemma-2b-it     | 0.6   | 64         |
+```
 
 Notes:
 * ratio=0.8 means 80% of FC (linear) layers are 4-bit weight quantized and the rest in 8-bit.
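For context on the two knobs in the table: in the OpenVINO/NNCF-style weight compression that the linked notebook uses, `ratio` is the fraction of FC (linear) layers whose weights are stored in 4-bit (the remainder stay 8-bit), and `group_size` is the number of consecutive weights that share one quantization scale. A minimal stdlib-only sketch of what those parameters mean — the helper names and the symmetric [-7, 7] 4-bit range below are illustrative assumptions, not the actual NNCF API:

```python
def split_by_ratio(fc_layer_names, ratio):
    """Mixed-precision split: `ratio` of the FC (linear) layers are
    4-bit weight quantized; the rest stay in 8-bit."""
    n4 = round(len(fc_layer_names) * ratio)
    return fc_layer_names[:n4], fc_layer_names[n4:]

def group_scales(weights, group_size):
    """Group-wise quantization: each run of `group_size` consecutive
    weights shares one scale, chosen here (illustratively) as
    max(|w|) / 7 so the group maps onto a symmetric 4-bit range."""
    scales = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        scales.append(max(abs(w) for w in group) / 7)
    return scales

# ratio=0.8 over 10 FC layers -> 8 layers in 4-bit, 2 in 8-bit
int4_layers, int8_layers = split_by_ratio([f"fc{i}" for i in range(10)], 0.8)

# group_size=64 over 128 weights -> 2 independent scales
scales = group_scales([0.5] * 64 + [1.4] * 64, 64)
```

A smaller `group_size` (64 vs. 128) means more scales per tensor, so outlier weights perturb fewer neighbors — at the cost of storing more scale metadata.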