Failed to predict at token position! Check your context buffer sizes!
Any idea why this fails when running in Koboldcpp? (I've tried both Q6_K and i1-Q6_K.) The first generation works, but every generation after that fails with some variant of the following error until I reset Kobold (the exact numbers change, of course, but it's the same general error each time):
Processing Prompt [BLAS] (210 / 210 tokens)init: the tokens of sequence 0 in the input batch have inconsistent sequence positions:
- the last position stored in the memory module of the context (i.e. the KV cache) for sequence 0 is X = 1877
- the tokens for sequence 0 in the input batch have a starting position of Y = 1572
it is required that the sequence positions remain consecutive: Y = X + 1
decode: failed to initialize batch
llama_decode: failed to decode, ret = -1
Failed to predict at token position 1572! Check your context buffer sizes!
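To make the log concrete, here is a minimal sketch of the condition the message describes. It is not the actual llama.cpp code; the function name is made up for illustration, and only the rule Y = X + 1 and the two numbers come from the log above.

```python
def positions_are_consecutive(last_cached_pos: int, batch_start_pos: int) -> bool:
    """Hypothetical helper: True when the batch starts right after the
    last position in the KV cache, i.e. Y == X + 1."""
    return batch_start_pos == last_cached_pos + 1


# Values taken from the log: X = 1877 (last cached), Y = 1572 (batch start).
print(positions_are_consecutive(1877, 1572))  # False: the batch starts too early
print(positions_are_consecutive(1877, 1878))  # True: what the decoder expects
```

In other words, the cache still holds tokens up to position 1877 while the new batch tries to start at 1572, so the decoder refuses the batch.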
It's rather annoying given the first output is looking quite good across a range of different tasks.
Alright, you might have figured it out by now but I'll leave the fix here if anyone else comes across this.
In Koboldcpp, change "nofastforward" to true (so it reads "nofastforward": true). This disables fast forward, and you are now good to go.
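If you'd rather script the change, here is a rough sketch. It assumes the setting lives in a saved Koboldcpp settings file that is plain JSON (the quoted key suggests this, but check your own file); the function name and the example path are hypothetical.

```python
import json
from pathlib import Path


def disable_fast_forward(config_path: str) -> dict:
    """Set "nofastforward" to true in a JSON settings file and save it.

    Assumes the file is plain JSON with top-level keys.
    """
    path = Path(config_path)
    settings = json.loads(path.read_text(encoding="utf-8"))
    settings["nofastforward"] = True  # the key named in the fix above
    path.write_text(json.dumps(settings, indent=2), encoding="utf-8")
    return settings


# Example (hypothetical path, adjust to wherever your settings are saved):
# disable_fast_forward("my_settings.kcpps")
```

Back up the file first, then restart Koboldcpp with the edited settings so the change takes effect.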