Update README.md

README.md CHANGED

@@ -105,13 +105,13 @@ print(tokenizer.decode(output[0]))
 
 ## Provided files
 
-**
+**gptq_model-4bit-64g.safetensors**
 
 This will work with AutoGPTQ as of commit `3cb1bf5` (`3cb1bf5a6d43a06dc34c6442287965d1838303d3`)
 
 It was created with groupsize 64 to give higher inference quality, and without `desc_act` (act-order) to increase inference speed.
 
-* `
+* `gptq_model-4bit-64g.safetensors`
 * Works only with latest AutoGPTQ CUDA, compiled from source as of commit `3cb1bf5`
 * At this time it does not work with AutoGPTQ Triton, but support will hopefully be added in time.
 * Works with text-generation-webui using `--autogptq --trust_remote_code`
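The bullets above pin AutoGPTQ to a source build at commit `3cb1bf5`. A typical way to do that is sketched below; the repository URL is an assumption (AutoGPTQ was hosted under PanQiWei on GitHub at the time), and building the CUDA extension requires a matching CUDA toolkit:

```shell
# Sketch: build AutoGPTQ's CUDA kernels from source at the pinned commit.
# Repo URL is an assumption; adjust if the project has moved.
git clone https://github.com/PanQiWei/AutoGPTQ
cd AutoGPTQ
git checkout 3cb1bf5a6d43a06dc34c6442287965d1838303d3
pip install .   # compiles the CUDA extension; needs a CUDA toolkit installed
```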
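The groupsize trade-off mentioned above can be illustrated with a toy per-group quantizer in plain NumPy. This is not AutoGPTQ's actual algorithm (real GPTQ also performs error-compensating rounding, which this sketch omits); it only shows why smaller groups give higher quality: each group of weights shares one scale, so finer groups fit the weights more closely at the cost of storing more scales.

```python
import numpy as np

def quantize_grouped(w, bits=4, group_size=64):
    """Toy symmetric per-group quantizer: every `group_size` consecutive
    weights share one scale. Illustrative only, not GPTQ itself."""
    qmax = 2 ** (bits - 1) - 1            # 7 for signed 4-bit
    groups = w.reshape(-1, group_size)
    scales = np.abs(groups).max(axis=1, keepdims=True) / qmax
    q = np.clip(np.round(groups / scales), -qmax - 1, qmax)
    return (q * scales).reshape(w.shape)  # dequantized approximation

rng = np.random.default_rng(0)
w = rng.normal(size=4096).astype(np.float32)
err64 = np.abs(quantize_grouped(w, group_size=64) - w).mean()
err1024 = np.abs(quantize_grouped(w, group_size=1024) - w).mean()
# Finer groups track the weights more closely: err64 < err1024.
```

With groupsize 64 instead of the more common 128, each tensor carries twice as many quantization scales, which is the "higher inference quality" trade-off the README refers to.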