Can I run this on text-gen-ui? I want to stream the token generation
#3, opened by asach
Please provide some instructions to run this. I really appreciate your work and help.
I think it is https://github.com/oobabooga/text-generation-webui?
I was able to run it on oobabooga
using 2x 3090:
- install oobabooga
- download TheBloke's 4-bit gptq into 'models' directory
- modify the following files
modules/models.py ->
config = AutoConfig.from_pretrained(path_to_model, trust_remote_code=True)
modules/AutoGPTQ_loader.py ->
# Define the params for AutoGPTQForCausalLM.from_quantized
params = {
...
"trust_remote_code": True,
...
}
- run ooba
python server.py --listen --model_type llama --wbits 4 --groupsize -1 --auto-devices
- in models tab, select WizardLM-Uncensored-Falcon-40b
- if it doesn't load, choose 4-bit and reload
- in instructions tab choose prompt instruct-wizardlm
- ask your question. It's slow but it works. The answers are spectacular.
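If you want to script the launch step instead of typing it each time, here is a minimal sketch. The flags are copied as-is from the command above; `build_launch_command` is just a hypothetical helper name, and it assumes you run it from the text-generation-webui directory where `server.py` lives. It only prints the command and does not start the server.

```python
import shlex
import sys


def build_launch_command(python_exe=sys.executable):
    """Assemble the server.py invocation quoted in the steps above.

    Hypothetical helper: the flag values are taken verbatim from the
    post and are not checked against any particular webui version.
    """
    flags = [
        "--listen",
        "--model_type", "llama",
        "--wbits", "4",
        "--groupsize", "-1",
        "--auto-devices",
    ]
    return [python_exe, "server.py", *flags]


if __name__ == "__main__":
    # Print the command so it can be inspected or pasted into a shell.
    print(shlex.join(build_launch_command()))
```

To actually start the server, you could pass the returned list to `subprocess.run` from inside the webui directory.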
I got it loaded with your instructions, but I get a nonsense response to the prompt:
### Response:DayGenVerEvEvEv```
Any advice?
Any plans for an uncensored version of the instruct trained falcon 40b?
I plan to train Dolphin on Falcon 40b, which I expect will be much better than falcon-40b-instruct.
What is your estimate for the release date of this model? Will it be 13b?
Best model I have tried for reasoning questions. Thank you!