Seq len
#1
by
Hypersniper
- opened
So just to clarify: if the seq len of the quantized model shows 8k, does that mean I can't use the full 16k? What should I set my max tokens and truncation settings to? (text-generation-webui)
You can use it at 16K. The seq len listed there is only the sequence length of the calibration samples used during quantization; it doesn't limit the context length of the quantized model. Please see the details under "Explanation of GPTQ parameters".
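If you want to confirm the model's actual context window rather than the quantization seq len, it's recorded as `max_position_embeddings` in the repo's `config.json`. A minimal sketch (the JSON values here are illustrative, not taken from any specific model):

```python
import json

# A trimmed config.json as you'd find in a typical 16K Llama-style
# model repo (illustrative values, not from a real model card).
config_text = """
{
  "max_position_embeddings": 16384,
  "hidden_size": 4096
}
"""

config = json.loads(config_text)

# This, not the GPTQ calibration seq len, is the usable context window.
print(config["max_position_embeddings"])  # 16384
```

In text-generation-webui you'd then set the truncation length to match this value.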