Running on llama.cpp
When trying to run with llama.cpp
./llama.cpp/server --port 8002 --host 0.0.0.0 -m llama.cpp/models/Mistral-Nemo-Instruct-2407-Q5_K_M.gguf -c 128000
I got: error loading model: create_tensor: tensor 'blk.0.attn_q.weight' has wrong shape; expected 5120, 5120, got 5120, 4096, 1, 1
llama.cpp does not support this model yet.
They just added support for the tokenizer a few hours ago; a few other things still need to go in, though.
It's in none of the releases yet. The change hasn't been merged.
https://github.com/ggerganov/llama.cpp/issues/8577
https://github.com/ggerganov/llama.cpp/pull/8579
The GGUF models have already been updated; they are based on llama.cpp b3438. If you run into any further issues, please let us know. Thanks a lot!
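Since the wrong-shape error comes from loading a Nemo GGUF with a llama.cpp build that predates the merged support, a quick sanity check is to compare your build number against b3438 before retrying. The sketch below parses a version string of the form `version: NNNN (hash)`; the sample string is a placeholder, and on a real install you would capture it from the binary's `--version` output instead.

```shell
#!/bin/sh
# Minimal sketch: check whether a llama.cpp build number is new enough
# for Mistral-Nemo GGUF support, which landed around build b3438.
# The version string below is an illustrative placeholder, not real output.
version_line="version: 3438 (abcdef0)"
required=3438

# Extract the numeric build number from the version string.
build=$(printf '%s\n' "$version_line" | sed -n 's/^version: \([0-9][0-9]*\).*/\1/p')

if [ "$build" -ge "$required" ]; then
  echo "build $build: new enough for Mistral Nemo GGUF"
else
  echo "build $build: too old, rebuild llama.cpp (need >= b$required)"
fi
```

If the check fails, pulling the latest llama.cpp, rebuilding, and re-downloading the regenerated GGUF should resolve the tensor-shape error.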