The responses are just infinite blank lines
#2 by dillfrescott - opened
As per the title, I tried using the exact settings from your example with this model, and it basically just hits the enter key endlessly, filling the entire terminal with blank lines.
This is in llama.cpp by the way. I forgot to mention that!
Yeah, I need to make this clearer in the future. You need to do this as well:
Change -c 2048 to the desired sequence length for this model. For example, -c 4096 for a Llama 2 model. For models that use RoPE, add --rope-freq-base 10000 --rope-freq-scale 0.5 for doubled context, or --rope-freq-base 10000 --rope-freq-scale 0.25 for 4x context.
In this example, as this model is trained to 16K context length, you need:
-c 16384 --rope-freq-base 10000 --rope-freq-scale 0.25
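Putting that together, a full llama.cpp command would look something like the sketch below. The model filename, prompt, and -n value are placeholders; swap in your actual quantized file and the prompt template from the model card:

# -c sets the full 16K window; --rope-freq-scale 0.25 stretches the base 4096 context by 4x
./main -m your-model.ggmlv3.q4_K_M.bin -c 16384 \
  --rope-freq-base 10000 --rope-freq-scale 0.25 \
  -p "Your prompt here" -n 512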
This won't be an issue any more with GGUF, as those settings will be embedded in the model. I will be providing GGUF files in the coming days.
Oh okay! Thank you!