Update README.md

README.md CHANGED

@@ -105,13 +105,13 @@ print(tokenizer.decode(output[0]))
 
 ## Provided files
 
-**
+**gptq_model-4bit-64g.safetensors**
 
 This will work with AutoGPTQ as of commit `3cb1bf5` (`3cb1bf5a6d43a06dc34c6442287965d1838303d3`)
 
 It was created with groupsize 64 to give higher inference quality, and without `desc_act` (act-order) to increase inference speed.
 
-* `
+* `gptq_model-4bit-64g.safetensors`
 * Works only with latest AutoGPTQ CUDA, compiled from source as of commit `3cb1bf5`
 * At this time it does not work with AutoGPTQ Triton, but support will hopefully be added in time.
 * Works with text-generation-webui using `--autogptq --trust_remote_code`
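The bullets above pin AutoGPTQ to a source build at commit `3cb1bf5`. A typical way to do that is sketched below; the repository URL is an assumption (AutoGPTQ was hosted under PanQiWei on GitHub at the time), and building the CUDA extension requires a matching CUDA toolkit:

```shell
# Sketch: build AutoGPTQ's CUDA kernels from source at the pinned commit.
# Repo URL is an assumption; adjust if the project has moved.
git clone https://github.com/PanQiWei/AutoGPTQ
cd AutoGPTQ
git checkout 3cb1bf5a6d43a06dc34c6442287965d1838303d3
pip install .   # compiles the CUDA extension; needs a CUDA toolkit installed
```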
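The groupsize trade-off mentioned above can be illustrated with a toy per-group quantizer in plain NumPy. This is not AutoGPTQ's actual algorithm (real GPTQ also performs error-compensating rounding, which this sketch omits); it only shows why smaller groups give higher quality: each group of weights shares one scale, so finer groups fit the weights more closely at the cost of storing more scales.

```python
import numpy as np

def quantize_grouped(w, bits=4, group_size=64):
    """Toy symmetric per-group quantizer: every `group_size` consecutive
    weights share one scale. Illustrative only, not GPTQ itself."""
    qmax = 2 ** (bits - 1) - 1            # 7 for signed 4-bit
    groups = w.reshape(-1, group_size)
    scales = np.abs(groups).max(axis=1, keepdims=True) / qmax
    q = np.clip(np.round(groups / scales), -qmax - 1, qmax)
    return (q * scales).reshape(w.shape)  # dequantized approximation

rng = np.random.default_rng(0)
w = rng.normal(size=4096).astype(np.float32)
err64 = np.abs(quantize_grouped(w, group_size=64) - w).mean()
err1024 = np.abs(quantize_grouped(w, group_size=1024) - w).mean()
# Finer groups track the weights more closely: err64 < err1024.
```

With groupsize 64 instead of the more common 128, each tensor carries twice as many quantization scales, which is the "higher inference quality" trade-off the README refers to.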