Failed to predict at token position! Check your context buffer sizes!

by MrParivir - opened Aug 10

Aug 10

Any idea why this fails to generate every single time in Koboldcpp after the first generation, until I reset Kobold? I've tried both Q6_K and i1-Q6_K. The error is always some variant of the following (the exact numbers change, of course, but it is the same general error each time):

Processing Prompt [BLAS] (210 / 210 tokens)init: the tokens of sequence 0 in the input batch have inconsistent sequence positions:
 - the last position stored in the memory module of the context (i.e. the KV cache) for sequence 0 is X = 1877
 - the tokens for sequence 0 in the input batch have a starting position of Y = 1572
 it is required that the sequence positions remain consecutive: Y = X + 1
decode: failed to initialize batch
llama_decode: failed to decode, ret = -1

Failed to predict at token position 1572! Check your context buffer sizes!

It's rather annoying given the first output is looking quite good across a range of different tasks.
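
For anyone reading the log for the first time, the rule it states (Y = X + 1) can be sketched in a few lines of Python. This is only an illustration of the condition described in the message, not the actual llama.cpp code; the function name `check_batch_start` is made up for this example.

```python
def check_batch_start(last_cached_pos: int, batch_start_pos: int) -> None:
    """Raise ValueError unless the batch starts right after the cache (Y = X + 1).

    Hypothetical helper that mirrors the rule quoted in the log.
    """
    expected = last_cached_pos + 1
    if batch_start_pos != expected:
        raise ValueError(
            f"inconsistent sequence positions: X = {last_cached_pos}, "
            f"Y = {batch_start_pos}, expected Y = {expected}"
        )


# Values from the log above: the cache ends at 1877 but the batch starts at 1572.
try:
    check_batch_start(1877, 1572)
except ValueError as err:
    print(err)
```

With the numbers from the log, the new batch starts well before the end of what is already in the KV cache, so the check fails and the decode is rejected.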

atopwhether

Sep 9

Alright, you might have figured it out by now but I'll leave the fix here if anyone else comes across this.
In Koboldcpp, set "nofastforward": true, which disables fast forward. After that you should be good to go.
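
If you would rather script the change than edit it by hand, here is a minimal sketch. It assumes your Koboldcpp settings are saved as a JSON file with a top-level "nofastforward" key, as the syntax above suggests; the function name and the file name in the usage comment are hypothetical, so adjust them to your setup.

```python
import json
from pathlib import Path


def disable_fast_forward(settings_path: str) -> dict:
    """Set "nofastforward" to true in a JSON settings file and return the settings.

    Assumption: the file is plain JSON with "nofastforward" at the top level.
    """
    path = Path(settings_path)
    settings = json.loads(path.read_text(encoding="utf-8"))
    settings["nofastforward"] = True  # written out as `true` in the JSON file
    path.write_text(json.dumps(settings, indent=2), encoding="utf-8")
    return settings


# Example call (hypothetical file name):
# disable_fast_forward("my_settings.kcpps")
```

All other keys in the file are left as they are; only "nofastforward" is added or overwritten. Keep a backup of the file if you are unsure.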
