The responses are just infinite blank lines
#2 by dillfrescott - opened
As per the title, I tried using the exact settings from your example with this model, and it basically just hits the enter key endlessly, filling the entire terminal with blank lines.
This is in llama.cpp by the way. I forgot to mention that!
Yeah, I need to make this clearer in the future. You need to do this as well:
Change -c 2048 to the desired sequence length for this model. For example, -c 4096 for a Llama 2 model. For models that use RoPE, add --rope-freq-base 10000 --rope-freq-scale 0.5 for doubled context, or --rope-freq-base 10000 --rope-freq-scale 0.25 for 4x context.
In this example, as this model is trained to 16K context length, you need:
-c 16384 --rope-freq-base 10000 --rope-freq-scale 0.25
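Putting that together, a full llama.cpp command would look something like the sketch below. The model filename, prompt, and -n value are placeholders; swap in your actual quantized file and the prompt template from the model card:

# -c sets the full 16K window; --rope-freq-scale 0.25 stretches the base 4096 context by 4x
./main -m your-model.ggmlv3.q4_K_M.bin -c 16384 \
  --rope-freq-base 10000 --rope-freq-scale 0.25 \
  -p "Your prompt here" -n 512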
This won't be an issue any more with GGUF, as those settings will be embedded in the model. I will be providing GGUF files in the coming days.
Oh okay! Thank you!