Prompt format
#1 opened by underlines
Thanks for this great merge and quant!
As usual with these merges, the card mentions multiple prompt formats :) Which one works best for you?
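For reference, the two templates I'm comparing look roughly like this. This follows the common Alpaca and Vicuna v1.1 conventions rather than anything specific to this card, so verify against the model card:

```python
# Common Alpaca-style instruction template (check the model card for the exact wording).
ALPACA = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)

# Common Vicuna v1.1-style template (check the model card for the exact wording).
VICUNA_V1_1 = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions. "
    "USER: {instruction} ASSISTANT:"
)

prompt = ALPACA.format(instruction="Name three primary colors.")
```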
My environment:
- thebloke/cuda11.8.0-ubuntu22.04-oneclick:latest on runpod
- 1 x RTX A6000 / 16 vCPU 62 GB RAM
- ExLlama
- max_seq_len 4096
- compress_pos_emb 2
- LLaMA-Precise (I tried others)
- Instruction Template: I tried Alpaca + Vicuna v1.1
- Mode: I tried chat, chat-instruct and instruct
It always gives me gibberish output.
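To rule out the webui layer, a minimal standalone exllama script along these lines should exercise the same max_seq_len / compress_pos_emb settings. This is only a sketch modeled on exllama's bundled example scripts: the model directory and the Alpaca-style prompt are placeholders, and the import paths and attribute names may differ in your exllama version:

```python
import glob
import os

# Assumes the standalone exllama repo layout, whose example scripts
# import the classes like this; adjust for your setup.
from model import ExLlama, ExLlamaCache, ExLlamaConfig
from tokenizer import ExLlamaTokenizer
from generator import ExLlamaGenerator

# Placeholder model directory (any 33B SuperHOT-8K GPTQ download).
model_dir = "/workspace/models/some-33B-SuperHOT-8K-GPTQ"

config = ExLlamaConfig(os.path.join(model_dir, "config.json"))
config.model_path = glob.glob(os.path.join(model_dir, "*.safetensors"))[0]
config.max_seq_len = 4096      # requested context length
config.compress_pos_emb = 2.0  # position compression matching the 4096 context above

model = ExLlama(config)
tokenizer = ExLlamaTokenizer(os.path.join(model_dir, "tokenizer.model"))
cache = ExLlamaCache(model)
generator = ExLlamaGenerator(model, tokenizer, cache)

# Alpaca-style prompt, just as a smoke test.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nName three primary colors.\n\n### Response:\n"
)
print(generator.generate_simple(prompt, max_new_tokens=64))
```

If this standalone script produces coherent text, the problem is more likely in the webui settings (loader, template, or mode) than in the quantized model itself.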
Yep, the same thing happened to me with these 30B / 33B SuperHOT models (gibberish output).
Tried:
Guanaco-33B-SuperHOT-8K-GPTQ
WizardLM-33B-V1.0-Uncensored-SuperHOT-8K-GPTQ
Wizard-Vicuna-30B-Superhot-8K-GPTQ
The 13B SuperHOT models seem to work fine.
OS: Ubuntu 22.04
CPU: 32 cores, RAM: 188 GB
GPU: NVIDIA A10 24 GB
Driver: 525.105.17
Oobabooga is updated to the latest version too.
For people using TheBloke's Runpod template: it wasn't updating ExLlama, but that's now fixed. Restart your pod or update ExLlama manually.
underlines changed discussion status to closed