Using the text-generation-webui API with the model

#6
by Hovav - opened

Hi, I'm trying to use the text-generation-webui API to run the model. The line I'm running: python server.py --api --api-blocking-port 8827 --api-streaming-port 8815 --model TheBloke_guanaco-65B-GPTQ --wbits 4 --chat
It loads the model correctly and I can connect to the API, but when I try to send a prompt it fails with:
File "/home/users///text-generation-webui/repositories/GPTQ-for-LLaMa/quant.py", line 426, in forward
quant_cuda.vecquant4matmul(x, self.qweight, y, self.scales, self.qzeros, self.groupsize)
TypeError: vecquant4matmul(): incompatible function arguments. The following argument types are supported:
1. (arg0: torch.Tensor, arg1: torch.Tensor, arg2: torch.Tensor, arg3: torch.Tensor, arg4: torch.Tensor, arg5: torch.Tensor) -> None

When I run the model through the web UI itself, everything works fine.

Any advice? Thanks!

Firstly, just checking: do you have 48GB of VRAM available? If not, I wouldn't recommend using this model.

If so, then this error looks to be caused by an issue with your GPTQ-for-LLaMa install: the compiled quant_cuda kernel expects six tensors, but quant.py is passing groupsize as a plain int, which usually means the Python code and the compiled CUDA extension are from mismatched versions. How did you install text-generation-webui and GPTQ-for-LLaMa? Did you recently try upgrading or changing GPTQ-for-LLaMa?

Yes, I have more than 48GB of VRAM. When I access the text-generation-webui and load the model in 4-bit, everything works correctly: I can send prompts and it generates text, so I don't think it's an environment problem. I installed text-generation-webui using the one-click installer for Linux. To test the API I'm using the script api-example-chat.py in the text-generation-webui folder. The API works fine for other models, but not for guanaco-65B-GPTQ. Maybe it's a configuration problem?
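For reference, the script I'm testing with boils down to roughly this (a minimal sketch based on api-example-chat.py, pointed at the blocking port from my launch command; payload fields can vary between text-generation-webui versions):

import requests

# Blocking chat API of text-generation-webui, as started with
# --api --api-blocking-port 8827 (the port is from the command above).
HOST = "localhost:8827"
URI = f"http://{HOST}/api/v1/chat"

def chat(user_input, history):
    payload = {
        "user_input": user_input,
        "history": history,      # {'internal': [...], 'visible': [...]}
        "mode": "chat",          # 'chat', 'chat-instruct', or 'instruct'
        "max_new_tokens": 250,
        "do_sample": True,
        "temperature": 0.7,
    }
    response = requests.post(URI, json=payload)
    response.raise_for_status()
    # the server returns the updated chat history
    return response.json()["results"][0]["history"]

history = {"internal": [], "visible": []}
history = chat("Hello! Who are you?", history)
print(history["visible"][-1][1])  # the model's latest reply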

Sorry, it seems it was an environment problem after all. I reinstalled it and now it works. Thanks for the quick reply!

Great, glad it's working

@Hovav, I am very new to text-generation-webui. Is it possible to serve my local models behind an API and call them the way I would with OpenAI API keys?
For example, I've been following many tutorials, and most of them use OpenAI keys; I want to use my local models instead. Is there a way to do this?

If possible, please point me towards articles/blogs/tutorials that do this.
Thanks.

@Sat7166 yes, it's possible. text-generation-webui has its own API which you can use, and it has an extension which provides an OpenAI-compatible API, i.e. you can hit text-generation-webui with exactly the same code you would use for OpenAI. Check the text-generation-webui GitHub for more details.
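Roughly, it looks like this (a sketch, not the extension's exact defaults: the port, 5001, and the placeholder model name are assumptions here, and it uses the legacy pre-1.0 openai Python package, so check the extension's README in the repo). The point is that existing OpenAI code only needs its base URL repointed:

import openai

# Point the OpenAI client at a local text-generation-webui instance
# running the openai extension (python server.py --extensions openai).
openai.api_key = "dummy"                      # any non-empty string works locally
openai.api_base = "http://localhost:5001/v1"  # assumed default extension port

response = openai.ChatCompletion.create(
    model="local-model",  # placeholder; the extension serves whatever model is loaded
    messages=[{"role": "user", "content": "Hello! Who are you?"}],
)
print(response["choices"][0]["message"]["content"])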

I can't find any tutorials on it, but there's info in their GitHub and people discussing it in various places, so you can try Googling for more info.

Thanks for your reply @TheBloke, I'll check it out.

Also, I wanted to thank you for your work. I am new to LLMs, but I like them very, very much. I've never really been so hyper-focused on anything before, and I love this feeling of working on new LLM-related projects.
You are a big part of helping me develop this, as I have an M1 Pro setup and the GGML versions are really my saviour here xD.
That said, I'd be very happy if you could point me towards online resources where I could learn more about LLMs: what makes them tick and how to optimize them the right way. I have of course read through a lot of articles, but it gets kind of overwhelming sometimes.
Thanks

Hi Sat, some extra info on the API messaging supported can be found here: https://github.com/oobabooga/text-generation-webui/blob/main/api-examples/api-example-chat-stream.py
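Distilled, that example uses the streaming (websocket) API and looks roughly like this (a sketch: the default streaming port 5005 is assumed here, and the event names come from the linked script, so they may differ across versions):

import asyncio
import json
import websockets

# Streaming chat API; the launch command earlier in this thread used
# --api-streaming-port 8815, but 5005 is the usual default.
URI = "ws://localhost:5005/api/v1/chat-stream"

async def stream_chat(user_input):
    request = {
        "user_input": user_input,
        "history": {"internal": [], "visible": []},
        "mode": "chat",
        "max_new_tokens": 250,
    }
    async with websockets.connect(URI, ping_interval=None) as websocket:
        await websocket.send(json.dumps(request))
        while True:
            data = json.loads(await websocket.recv())
            if data["event"] == "text_stream":
                # each event carries the full history so far; the last
                # visible pair's second element is the growing reply
                print(data["history"]["visible"][-1][1], end="\r", flush=True)
            elif data["event"] == "stream_end":
                print()
                break

asyncio.run(stream_chat("Hello! Who are you?"))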

I am also looking for some more info on it and will post as I come across it!

@ahtripleblind, thanks :-)

@Hovav I am implementing a similar kind of use case. Are you running this on Runpod?

As a side note, I ended up using Hugging Face's chat-ui instead. Its documentation is a lot better defined and clearer.

Best of luck!!
