Failed to predict at token position! Check your context buffer sizes!

#1
by MrParivir - opened

Any idea why this fails when running in Koboldcpp? I've tried both Q6_K and i1-Q6_K. The first generation works, but every generation after that fails with some variant of the following error until I reset Kobold (the exact numbers change, of course, but it's the same general error each time):

Processing Prompt [BLAS] (210 / 210 tokens)init: the tokens of sequence 0 in the input batch have inconsistent sequence positions:
 - the last position stored in the memory module of the context (i.e. the KV cache) for sequence 0 is X = 1877
 - the tokens for sequence 0 in the input batch have a starting position of Y = 1572
 it is required that the sequence positions remain consecutive: Y = X + 1
decode: failed to initialize batch
llama_decode: failed to decode, ret = -1

Failed to predict at token position 1572! Check your context buffer sizes!

It's rather annoying given the first output is looking quite good across a range of different tasks.
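
For anyone reading the log, the rule it complains about (Y = X + 1) can be sketched in a few lines. This is only a simplified illustration of the check described in the error message, not llama.cpp's actual code; the function name `check_batch_positions` is made up for the example.

```python
def check_batch_positions(last_cached_pos: int, batch_start_pos: int) -> bool:
    """Return True when a new batch continues directly after the cached tokens.

    last_cached_pos is X in the log (last position stored in the KV cache),
    batch_start_pos is Y (position of the first token in the incoming batch).
    """
    return batch_start_pos == last_cached_pos + 1


# Values taken from the log above: X = 1877, Y = 1572.
x, y = 1877, 1572
if not check_batch_positions(x, y):
    print(f"inconsistent positions: got Y = {y}, expected Y = {x + 1}")
```

With the numbers from the log, the batch starts well before the end of the cached sequence, which is why the decode is rejected.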

Alright, you might have figured it out by now, but I'll leave the fix here in case anyone else comes across this.
In Koboldcpp, disable fast forward by changing the setting to "nofastforward": true. After that you should be good to go.
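
If you keep your Koboldcpp settings in a saved JSON config file, a small script can flip the key for you. This is a rough sketch that assumes the file is plain JSON containing a top-level "nofastforward" key as quoted above; the helper name `disable_fast_forward` and the file name in the usage comment are hypothetical.

```python
import json
from pathlib import Path


def disable_fast_forward(config_path: str) -> dict:
    """Set "nofastforward" to true in a JSON settings file and save it back.

    Assumes the file is a flat JSON object; other keys are left untouched.
    """
    path = Path(config_path)
    settings = json.loads(path.read_text(encoding="utf-8"))
    settings["nofastforward"] = True  # the fix described above
    path.write_text(json.dumps(settings, indent=2), encoding="utf-8")
    return settings


# Usage (hypothetical file name, adjust to wherever your settings are saved):
# disable_fast_forward("my_settings.kcpps")
```

Make a backup of the settings file first, since the script overwrites it in place.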
