How to quantize?

#1
by miraclezst - opened

I get an error when I try to quantize the model like this:

model = AutoModelForCausalLM.from_pretrained("hiyouga/Llama-2-Chinese-13b-chat").quantize(8).cuda()

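Use AutoGPTQ. A Llama model loaded through AutoModelForCausalLM has no quantize() method, which is why the call above fails.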
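For anyone landing here later, this is a minimal sketch of what 8-bit quantization with AutoGPTQ could look like, assuming the `auto-gptq` and `transformers` packages are installed and a CUDA GPU is available. The helper name `quantize_with_autogptq`, the `SUPPORTED_BITS` constant, and the `group_size=128` / `desc_act=False` settings are illustrative choices, not something the model author specified. GPTQ also needs a few calibration texts, which you have to supply yourself.

```python
SUPPORTED_BITS = (2, 3, 4, 8)  # bit widths accepted by AutoGPTQ's BaseQuantizeConfig


def quantize_with_autogptq(model_id, output_dir, calibration_texts, bits=8):
    """Quantize a causal LM with GPTQ and save the result to output_dir.

    Hypothetical helper for illustration; not part of AutoGPTQ itself.
    """
    if bits not in SUPPORTED_BITS:
        raise ValueError(f"bits must be one of {SUPPORTED_BITS}, got {bits}")
    if not calibration_texts:
        raise ValueError("GPTQ needs at least one calibration text")

    # Heavy imports are deferred so the argument checks above run without them.
    from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # Each calibration example is a dict holding input_ids and attention_mask.
    examples = [tokenizer(text) for text in calibration_texts]

    # group_size and desc_act are assumed defaults; tune them for your use case.
    config = BaseQuantizeConfig(bits=bits, group_size=128, desc_act=False)

    # Load the full-precision weights, run GPTQ calibration, then save.
    model = AutoGPTQForCausalLM.from_pretrained(model_id, config)
    model.quantize(examples)
    model.save_quantized(output_dir)
    tokenizer.save_pretrained(output_dir)
    return output_dir
```

You would call it as quantize_with_autogptq("hiyouga/Llama-2-Chinese-13b-chat", "some-output-dir", ["a few representative prompts"]), where the output directory and calibration texts are placeholders. The saved model can then be loaded with AutoGPTQForCausalLM.from_quantized(output_dir, device="cuda:0").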

hiyouga changed discussion status to closed
