Which llama.cpp version should I use?
llama.cpp version:
commit 553f1e46e9e864514bbd6bf4009146db66be0541 (HEAD, tag: b4600, origin/master, origin/HEAD)
Author: Olivier Chafik <ochafik@users.noreply.github.com>
Date: Thu Jan 30 22:01:06 2025 +0000
This is the error log:
llama_model_load: error loading model: error loading model vocabulary: unknown pre-tokenizer type: 'deepseek-r1-qwen'
llama_load_model_from_file: failed to load model
llama_init_from_gpt_params: failed to load model '/home/xxx/Models/DeepSeek-R1-Distill-Qwen-32B-Uncensored-GGUF/DeepSeek-R1-Distill-Qwen-32B-Uncensored.Q8_0.gguf'
main: error: unable to load model
The quants were done with b4526, and any later version should work. b4600 has support for that pre-tokenizer, so you are likely not running the version you think you are, but an older one.
Just tested it; it loads and works fine with both b4526 and b4600.
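If you want to double-check which build you are actually running, here is a rough sketch, assuming a standard CMake build of the llama.cpp repo (paths, binary names, and the --version flag may differ depending on your setup and build method):

# inside your llama.cpp checkout: confirm which tag/commit is checked out
git fetch --tags
git describe --tags        # should print b4600 (or a later tag)

# rebuild so the binaries actually match the checkout
cmake -B build
cmake --build build --config Release -j

# ask the freshly built binary which build it is; if it reports something
# older than 4526, a stale binary elsewhere on your PATH is being picked up
./build/bin/llama-cli --version

If the version printed by the binary does not match the commit you quoted above, you are invoking an older install rather than the b4600 checkout.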