Rename quantize_config.json to quantization_config.json
It seems like optimum.gptq.load_quantized_model loads the quantization config from quantization_config.json.
No, it loads it from config.json: https://huggingface.co/TheBloke/Llama-2-7b-Chat-GPTQ/blob/main/config.json#L23-L32
quantize_config.json is for AutoGPTQ. The files have been tested with Transformers and Optimum and are fine.
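For what it's worth, here is a quick sketch to confirm this yourself; the repo id is the one linked above, and hf_hub_download is only used to fetch the file locally:

import json
from huggingface_hub import hf_hub_download

# fetch config.json from the Hub repo linked above; the GPTQ settings
# live under its "quantization_config" key, not in a separate file
path = hf_hub_download("TheBloke/Llama-2-7b-Chat-GPTQ", "config.json")
with open(path, encoding="utf-8") as f:
    print(json.load(f)["quantization_config"])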
Hmm, I don't think you can load this from a custom path. Loading the model into memory works fine, but after model.save_pretrained() to a local path, the following:
import torch
from accelerate import init_empty_weights
from optimum.gptq import load_quantized_model

# disable exllama kernels if the GPTQ model is loaded on CPU
disable_exllama = not torch.cuda.is_available()

# auto_class (e.g. AutoModelForCausalLM) and llm.model_id are defined earlier
with init_empty_weights():
    empty = auto_class.from_pretrained(llm.model_id, torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32, device_map='auto')
empty.tie_weights()
model = load_quantized_model(empty, save_folder="/path/to/saved", device_map='auto', disable_exllama=disable_exllama)
runs into the following issue:
    model = load_quantized_model(empty, save_folder="/home/ubuntu/gptq-13b-local", device_map='auto', disable_exllama=disable_exllama)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/.pyenv/versions/3.11.4/lib/python3.11/site-packages/optimum/gptq/quantizer.py", line 614, in load_quantized_model
    with open(os.path.join(save_folder, quant_config_name), "r", encoding="utf-8") as f:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/home/ubuntu/gptq-13b-local/quantization_config.json'
While I do believe this should also be fixed in Optimum's load_quantized_model so it falls back to config.json, I don't know the Optimum team's release schedule, so it would be nice to also ship a quantization_config.json in the meantime.
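As a stopgap, here is a minimal sketch (untested, and assuming save_folder is the directory produced by model.save_pretrained()) that materializes the file load_quantized_model expects from the quantization_config block already present in config.json:

import json
import os

save_folder = "/path/to/saved"  # directory written by model.save_pretrained()

# copy the quantization_config block from config.json into the standalone
# quantization_config.json that optimum's load_quantized_model looks for
with open(os.path.join(save_folder, "config.json"), encoding="utf-8") as f:
    config = json.load(f)
with open(os.path.join(save_folder, "quantization_config.json"), "w", encoding="utf-8") as f:
    json.dump(config["quantization_config"], f, indent=2)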
Could you raise this as an issue on the Optimum GitHub? They're doing a release soon to fix another GPTQ-related issue, so maybe they'll look at this soon, or have already fixed it.
Yes, I've opened an issue with them about this.