Text Generation
Transformers
Safetensors
English
llama
text-generation-inference
4-bit precision
awq
TheBloke committed on
Commit d87af94
1 Parent(s): d37052e

Update for Transformers AWQ support

Files changed (1)
  1. config.json +9 -2
config.json CHANGED
@@ -24,5 +24,12 @@
   "torch_dtype": "float16",
   "transformers_version": "4.34.0.dev0",
   "use_cache": true,
-  "vocab_size": 32003
-}
+  "vocab_size": 32003,
+  "quantization_config": {
+    "quant_method": "awq",
+    "zero_point": true,
+    "group_size": 128,
+    "bits": 4,
+    "version": "gemm"
+  }
+}
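
The change adds a "quantization_config" block to config.json so that tooling can read the AWQ settings directly from the model config. As a rough illustration of what a consumer of this file might check, here is a minimal sketch using only the Python standard library. The helper name read_awq_settings and the REQUIRED_KEYS tuple are hypothetical and are not part of Transformers or any AWQ library; the JSON fragment contains only the keys visible in the diff, not the full config.

```python
import json

# Fragment of config.json after this commit. Only the keys visible in the
# diff are included; the real file has more entries above line 24.
CONFIG_TEXT = """
{
  "torch_dtype": "float16",
  "transformers_version": "4.34.0.dev0",
  "use_cache": true,
  "vocab_size": 32003,
  "quantization_config": {
    "quant_method": "awq",
    "zero_point": true,
    "group_size": 128,
    "bits": 4,
    "version": "gemm"
  }
}
"""

# Assumption: these are the keys a loader would want to see. The list is
# taken from the diff itself, not from any library's documented schema.
REQUIRED_KEYS = ("quant_method", "zero_point", "group_size", "bits", "version")


def read_awq_settings(config_text: str) -> dict:
    """Parse config JSON and return its AWQ quantization settings.

    Hypothetical helper for illustration only.
    """
    config = json.loads(config_text)
    quant = config.get("quantization_config")
    if quant is None:
        raise ValueError("config has no quantization_config block")
    missing = [key for key in REQUIRED_KEYS if key not in quant]
    if missing:
        raise ValueError(f"quantization_config is missing keys: {missing}")
    if quant["quant_method"] != "awq":
        raise ValueError(f"expected quant_method 'awq', got {quant['quant_method']!r}")
    return quant


settings = read_awq_settings(CONFIG_TEXT)
print(settings["bits"], settings["group_size"], settings["version"])  # prints: 4 128 gemm
```

Run against the pre-commit version of the file (no "quantization_config" key), this sketch would raise a ValueError, which is the gap the commit closes.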