Error when using the text-generation-webui API with the model

#12
by carlosbdw - opened

GPU: A40(48GB) * 1
CPU: 15 vCPU AMD EPYC 7543 32-Core Processor
MEM: 80GB

/root/text-generation-webui
bin /root/miniconda3/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda116.so
INFO:Loading guanaco-65B-GPTQ...
CUDA extension not installed.
INFO:Found the following quantized model: models/guanaco-65B-GPTQ/Guanaco-65B-GPTQ-4bit.act-order.safetensors
Traceback (most recent call last):
  File "/root/text-generation-webui/server.py", line 1102, in <module>
    shared.model, shared.tokenizer = load_model(shared.model_name)
  File "/root/text-generation-webui/modules/models.py", line 97, in load_model
    output = load_func(model_name)
  File "/root/text-generation-webui/modules/models.py", line 291, in GPTQ_loader
    model = modules.GPTQ_loader.load_quantized(model_name)
  File "/root/text-generation-webui/modules/GPTQ_loader.py", line 177, in load_quantized
    model = load_quant(str(path_to_model), str(pt_path), shared.args.wbits, shared.args.groupsize, kernel_switch_threshold=threshold)
  File "/root/text-generation-webui/modules/GPTQ_loader.py", line 84, in _load_quant
    model.load_state_dict(safe_load(checkpoint), strict=False)
  File "/root/miniconda3/lib/python3.10/site-packages/safetensors/torch.py", line 259, in load_file
    with safe_open(filename, framework="pt", device=device) as f:
safetensors_rust.SafetensorError: Error while deserializing header: MetadataIncompleteBuffer

This looks like the model didn't download fully. Check that you have enough disk space, then try the download again. text-gen-ui's downloader auto-resumes, so you don't need to re-download the whole thing; it will only fetch whatever parts are missing.
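Not part of the original reply, but as a rough illustration: the Python sketch below checks whether the safetensors header deserializes cleanly and, if it doesn't, resumes the download via huggingface_hub. The repo id TheBloke/guanaco-65B-GPTQ, the local folder, and the file name are assumptions taken from the log above; adjust them to your setup.

# Sketch: verify the quantized weights file and resume a partial download.
# Assumes huggingface_hub, safetensors and torch are installed, and that the
# repo id / paths below match your setup (they are guesses based on the log).
from pathlib import Path

from huggingface_hub import snapshot_download
from safetensors import safe_open

local_dir = Path("models/guanaco-65B-GPTQ")
weights = local_dir / "Guanaco-65B-GPTQ-4bit.act-order.safetensors"

def header_is_complete(path: Path) -> bool:
    # Opening the file is enough to parse the header; a truncated download
    # raises SafetensorError: MetadataIncompleteBuffer at this point.
    try:
        with safe_open(str(path), framework="pt", device="cpu"):
            return True
    except Exception:
        return False

if not weights.exists() or not header_is_complete(weights):
    # Re-fetch any missing or incomplete files into the same folder.
    snapshot_download(
        repo_id="TheBloke/guanaco-65B-GPTQ",
        local_dir=str(local_dir),
        resume_download=True,
    )

The equivalent from the shell, using text-gen-ui's own downloader, would be something like python download-model.py TheBloke/guanaco-65B-GPTQ, run from the text-generation-webui directory.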
