Python instructions didn't just work
#2 by Njax - opened
I'm very new to GPTQ, so please excuse this message if I'm in error, but I couldn't get the example Python code to work as written. I wound up changing it to this:
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

# Assumed setup, not shown in my original snippet: point these at your model.
model_basename = "path/to/the-quantized-model"  # local dir or repo id (placeholder)
use_triton = False

tokenizer = AutoTokenizer.from_pretrained(model_basename, trust_remote_code=True)

model = AutoGPTQForCausalLM.from_quantized(model_basename,
                                           use_safetensors=True,
                                           trust_remote_code=True,
                                           device="cuda:0",
                                           use_triton=use_triton,
                                           quantize_config=None)

user_input = '''
// A javascript function
function printHelloWorld() {
'''

inputs = tokenizer(user_input, return_tensors="pt").to(model.device)
# generate() returns token ids rather than embeddings, hence the name:
output_ids = model.generate(**inputs, max_new_tokens=40)[0]
outputs = tokenizer.decode(output_ids)
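In case anyone else finds the echoed prompt noisy, here's a minimal sketch (assuming the same tokenizer, inputs, and output_ids objects from above) that decodes only the newly generated tokens:

# Slice off the prompt tokens so only the completion is decoded.
prompt_len = inputs["input_ids"].shape[1]
completion = tokenizer.decode(output_ids[prompt_len:], skip_special_tokens=True)
print(completion)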
I used CUDA 11.7, torch 2.0.1+cu117, and auto-gptq 0.2.2, which perhaps explains the difference.
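For anyone wanting to compare environments, here's a quick way to print the same version info (standard-library and torch calls only, nothing specific to this model):

import importlib.metadata
import torch

print(torch.__version__)                        # e.g. 2.0.1+cu117
print(torch.version.cuda)                       # e.g. 11.7
print(importlib.metadata.version("auto-gptq"))  # e.g. 0.2.2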
Thanks for uploading this. Confusing stuff at times, but it sure is exciting to try something new!