ExLlama
#1
by
ndurkee
- opened
I just wanted to confirm that this works with ExLlama v1. I can't comment on v2 at the moment.
Great, thanks for letting us know!
It works with ExLlama v2 (release 0.0.4). Here's a sample session:
c:\AI\exllamav2>call .\venv\Scripts\activate & python examples/chat.py --mode raw --model_dir c:\AI\exllamav2\models\Mistral-7B-Instruct-v0.1-GPTQ-4bit-32g-actorder_True
-- Model: c:\AI\exllamav2\models\Mistral-7B-Instruct-v0.1-GPTQ-4bit-32g-actorder_True
-- Options: ['rope_scale 1.0', 'rope_alpha 1.0']
-- Loading model...
-- Loading tokenizer...
User: Hi
Chatbort: Hello! How can I help you today?
Are you finding it slower in ExLlama v2 than in ExLlama v1? I do.