GGUF conversion is incorrect

#2
by Noseu - opened

Your description says that you applied the PR https://github.com/ggerganov/llama.cpp/pull/6920
However, it fails the test in the PR.

The following prompt always returns incorrect results:
What is 3333+777?

Owner

If you load it with Kobold, you will see that it's a GGUF I made after the llama.cpp update. Erf. Maybe I should redo them again?

It is weird. I see your GGUF has the right tokenizer.ggml.pre flag (llama-bpe), yet when I look at it in the debugger, the tokenization is broken in your version of the GGUF.
I don't have enough disk space to try the non-GGUF version with PyTorch to see whether the original unquantized version has the same issue.

Owner

No worries, I will check later

Thanks!
Specifically, right now it's:
string: '3333+777?'
input tokens: [ '33':1644, '33':1644, '+':10, '777':15831, '?':30 ]

It should be:
['333':8765, '3':18, '+':10, '777':15831, '?':30]
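To make the difference concrete, here is a minimal sketch in Python. It assumes that the llama-bpe pre-tokenizer cuts runs of digits, left to right, into chunks of at most three digits, which is what the expected output above suggests. The regex is a simplified stand-in written for illustration, not the actual llama.cpp pattern, and the function name pre_tokenize is hypothetical. The token pieces and IDs are copied from the two lists above.

```python
import re

# Simplified stand-in for the digit rule (assumption): digit runs are split
# left to right into chunks of 1-3 digits; any other non-space character
# becomes its own piece. The real pre-tokenizer pattern has more cases.
PRE_TOKENIZE = re.compile(r"\d{1,3}|\S")


def pre_tokenize(text):
    """Split text into pre-token pieces using the simplified rule."""
    return PRE_TOKENIZE.findall(text)


def pieces(tokens):
    """Drop the IDs and keep only the text pieces."""
    return [piece for piece, _ in tokens]


# Token lists as reported in this thread: (piece, id).
observed = [("33", 1644), ("33", 1644), ("+", 10), ("777", 15831), ("?", 30)]
expected = [("333", 8765), ("3", 18), ("+", 10), ("777", 15831), ("?", 30)]

chunks = pre_tokenize("3333+777?")
print(chunks)
print("matches expected:", chunks == pieces(expected))
print("matches observed:", chunks == pieces(observed))
```

Both lists decode back to the same string, so the text is not corrupted. Only the split of '3333' differs, which is consistent with the pre-tokenization step not being applied at inference time even though the metadata flag is set.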