Update README.md
README.md CHANGED
@@ -10,6 +10,21 @@ It was created by merging the deltas provided in the above repo with the origina
 
 It was then quantized to 4bit using [GPTQ-for-LLaMa](https://github.com/qwopqwop200/GPTQ-for-LLaMa).
 
+## My Vicuna 1.1 model repositories
+
+I have the following Vicuna 1.1 repositories available:
+
+**13B models:**
+* [Unquantized 13B 1.1 model for GPU - HF format](https://huggingface.co/TheBloke/vicuna-13B-1.1-HF)
+* [GPTQ quantized 4bit 13B 1.1 for GPU - `safetensors` and `pt` formats](https://huggingface.co/TheBloke/vicuna-13B-1.1-GPTQ-4bit-128g)
+* [GPTQ quantized 4bit 13B 1.1 for CPU - GGML format for `llama.cpp`](https://huggingface.co/TheBloke/vicuna-13B-1.1-GPTQ-4bit-128g-GGML)
+
+**7B models:**
+* [Unquantized 7B 1.1 model for GPU - HF format](https://huggingface.co/TheBloke/vicuna-7B-1.1-HF)
+* [GPTQ quantized 4bit 7B 1.1 for GPU - `safetensors` and `pt` formats](https://huggingface.co/TheBloke/vicuna-7B-1.1-GPTQ-4bit-128g)
+* [GPTQ quantized 4bit 7B 1.1 for CPU - GGML format for `llama.cpp`](https://huggingface.co/TheBloke/vicuna-7B-1.1-GPTQ-4bit-128g-GGML)
+
 ## Provided files
 
 Two model files are provided. Ideally use the `safetensors` file. Full details below:
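
The diff notes the model was quantized with GPTQ-for-LLaMa but does not show the command used. For context, quantization in that repo is typically run through its `llama.py` entry point; a minimal sketch follows, with the model path and output filename being illustrative rather than the author's exact invocation:

```bash
# Quantize an HF-format Vicuna checkpoint to 4-bit GPTQ with group size 128,
# using the c4 calibration dataset. Paths and output name are illustrative.
python llama.py ./vicuna-13B-1.1-HF c4 \
    --wbits 4 \
    --groupsize 128 \
    --save_safetensors vicuna-13B-1.1-GPTQ-4bit-128g.safetensors
```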
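The GGML repositories listed above are intended for CPU inference with `llama.cpp`. A minimal usage sketch, assuming the GGML file has already been downloaded (the filename here is illustrative):

```bash
# Run the GGML model on CPU with llama.cpp's main example binary.
# -m: model file, -n: tokens to generate, -p: prompt text.
./main -m ./models/vicuna-13B-1.1-GPTQ-4bit-128g.GGML.bin \
    -n 256 \
    -p "A chat between a curious user and an artificial intelligence assistant."
```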