info on models

README.md
https://github.com/qwopqwop200/GPTQ-for-LLaMa

LoRA credit to https://huggingface.co/baseten/alpaca-30b

# Update 2023-03-29
There is also a non-groupsize quantized model that is 1 GB smaller, which should allow running at the maximum context length on 24 GB of VRAM. The evaluations are better on the 128 groupsize version, but the tradeoff is not being able to run it at full context without offloading or a GPU with more VRAM.
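
As a rough sanity check on that claim, here is a back-of-the-envelope VRAM estimate. This is a sketch only: the layer and hidden dimensions are LLaMA-30B's published config, while the weight sizes and the fp16 KV-cache layout are assumptions, not measurements.

```python
# Back-of-the-envelope VRAM budget for LLaMA-30B 4-bit at full context.
# n_layers/hidden come from LLaMA-30B's config; weight sizes are approximations.
n_layers, hidden, ctx = 60, 6656, 2048

# KV cache: keys + values, fp16 (2 bytes), per layer, per token.
kv_gib = 2 * n_layers * hidden * 2 * ctx / 2**30
print(f"KV cache at {ctx} tokens: ~{kv_gib:.1f} GiB")  # ~3.0 GiB

# Assumed loaded weight sizes; the non-groupsize file is ~1 GB smaller.
for name, weights_gib in [("128 groupsize", 17.0), ("no groupsize", 16.0)]:
    print(f"{name}: ~{weights_gib + kv_gib:.1f} GiB + activation overhead, vs. a 24 GiB card")
```

With weights plus KV cache already near 20 GiB, activation overhead pushes the 128g file past a 24 GiB budget at full context, which is why the smaller file can make the difference.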

# Update 2023-03-27
New weights have been added. The old .pt version is no longer supported and has been replaced by a 128 groupsize safetensors file. Update to the latest GPTQ version/webui.
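
After updating, one quick way to confirm the safetensors file loads cleanly is to open it and list a few tensors. A minimal sketch using the `safetensors` library; the `qweight`/`qzeros`/`scales` naming mentioned in the comment is GPTQ-for-LLaMa's convention and is an assumption here, not something stated in this card.

```python
# Minimal sketch: peek at the quantized checkpoint without loading the model.
from safetensors import safe_open

with safe_open("alpaca-30b-4bit-128g.safetensors", framework="pt", device="cpu") as f:
    keys = list(f.keys())
    print(f"{len(keys)} tensors")
    # GPTQ-for-LLaMa stores each linear layer as qweight/qzeros/scales (assumption).
    for k in keys[:6]:
        t = f.get_tensor(k)
        print(k, tuple(t.shape), t.dtype)
```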

Evals - Groupsize 128 + True Sequential
-----
**alpaca-30b-4bit-128g.safetensors** [4805cc2]

**c4-new** - 6.398105144500732

**wikitext2** - 4.402845859527588

Evals - Default + True Sequential
-----
**alpaca-30b-4bit.safetensors** [6958004]

**c4-new** - 6.592941761016846

**ptb-new** - 8.718379974365234

**wikitext2** - 4.635514736175537
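
The figures above are perplexities on each dataset (lower is better). For readers who want to reproduce this kind of number, here is a generic sliding-window perplexity sketch, assuming a HuggingFace-style causal LM and a pre-tokenized test stream are already in hand; the 2048-token window matches LLaMA's context length, and everything else is illustrative rather than GPTQ-for-LLaMa's exact eval code.

```python
import torch

@torch.no_grad()
def perplexity(model, input_ids, window=2048):
    """Non-overlapping-window perplexity over one long token stream.

    Assumes a HuggingFace-style causal LM that returns the mean
    cross-entropy as `loss` when `labels` are supplied (labels are
    shifted internally by the model).
    """
    total_nll, total_tokens = 0.0, 0
    for start in range(0, input_ids.size(1), window):
        chunk = input_ids[:, start:start + window].to(model.device)
        if chunk.size(1) < 2:          # nothing left to predict in this chunk
            continue
        out = model(chunk, labels=chunk)
        n = chunk.size(1) - 1          # tokens actually scored
        total_nll += out.loss.item() * n
        total_tokens += n
    return torch.exp(torch.tensor(total_nll / total_tokens)).item()
```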

# Usage
1. Run it manually through GPTQ (a sketch follows this list)
2. (More setup but better UI) - Use the [text-generation-webui](https://github.com/oobabooga/text-generation-webui/wiki/LLaMA-model#4-bit-mode). Make sure to follow the installation steps [here](https://github.com/oobabooga/text-generation-webui#installation) before adding GPTQ support.
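
For step 1, a minimal inference sketch in the spirit of GPTQ-for-LLaMa's `llama_inference.py`. The `load_quant` helper, its signature, and the paths here are assumptions based on the March-2023 code and may differ between revisions, so treat this as a template and check your checkout.

```python
# Minimal sketch of manual 4-bit inference (run from a GPTQ-for-LLaMa checkout).
# load_quant's module location and signature vary by revision (assumption).
import torch
from transformers import AutoTokenizer
from llama import load_quant  # assumption: helper from the GPTQ-for-LLaMa repo

base = "path/to/llama-30b-hf"  # placeholder: base model dir for config/tokenizer
# wbits=4, groupsize=128 for the 128g file; GPTQ-for-LLaMa conventionally uses
# groupsize=-1 for the non-groupsize file (assumption).
model = load_quant(base, "alpaca-30b-4bit-128g.safetensors", 4, 128)
model.to("cuda").eval()

tok = AutoTokenizer.from_pretrained(base)
prompt = "Below is an instruction that describes a task."
ids = tok(prompt, return_tensors="pt").input_ids.to("cuda")
with torch.no_grad():
    out = model.generate(ids, max_new_tokens=128, do_sample=True, top_p=0.95)
print(tok.decode(out[0], skip_special_tokens=True))
```

For step 2, the linked wiki page covers launching the webui in 4-bit mode; per that page, the relevant launch flags at the time were `--wbits 4` and, for the 128g file, `--groupsize 128`.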