Issue with --n-gpu-layers 5 Parameter: Model Only Running on CPU

#10
by vuk123 - opened

Hi, I’m facing an issue where the --n-gpu-layers 5 parameter doesn’t seem to work. Despite having 2x NVIDIA A6000 GPUs, the model runs entirely on the CPU, with no GPU utilization. Has anyone else encountered this, or is there a fix for it?

This is how I run the model: llama-cli --model /home/user/mymodels/DeepSeek-V3-Q3_K_M/DeepSeek-V3-Q3_K_M-00001-of-00007.gguf --cache-type-k q5_0 --threads 16 --prompt '<|User|>What is 1+1?<|Assistant|>' --n-gpu-layers 5
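For context, this is how I'm checking GPU activity while the model loads; both commands are plain nvidia-smi usage, nothing specific to llama.cpp, and the utilization column stays at 0 the whole time:

watch -n 1 nvidia-smi
nvidia-smi --query-gpu=index,memory.used,utilization.gpu --format=csv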

It looks like the problem is that I installed llama.cpp with brew, so it wasn't compiled with CUDA...

I rebuilt it with cmake, and now it works...
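For reference, a source build of llama.cpp with CUDA looks roughly like this (the repo URL and the build/bin path are the upstream defaults; the model path and --n-gpu-layers value are just my own setup). The important part is to run the binary from build/bin instead of the brew-installed one:

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release
./build/bin/llama-cli --model /home/user/mymodels/DeepSeek-V3-Q3_K_M/DeepSeek-V3-Q3_K_M-00001-of-00007.gguf --n-gpu-layers 5 --prompt '<|User|>What is 1+1?<|Assistant|>'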

Unsloth AI org

I rebuilt it with cmake, and now it works...

glad you got it working!!


I rebuilt it with cmake, and now it works...

I built it with the commands:

cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release

The GPU memory is occupied, but GPU utilization stays at 0%, and it still seems to be running on the CPU.
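In case it helps anyone debugging the same thing, two checks narrow this down: whether the shell is actually running the freshly built binary under build/bin (and not an older CPU-only copy earlier in PATH), and how many layers are really offloaded. With --n-gpu-layers 5 only five layers of a model with dozens of layers sit on the GPU, so almost all the compute stays on the CPU and nvidia-smi shows memory in use but near-0% utilization; that part is expected. The value 20 below is only illustrative and depends on available VRAM:

which llama-cli
./build/bin/llama-cli --version
./build/bin/llama-cli --model /home/user/mymodels/DeepSeek-V3-Q3_K_M/DeepSeek-V3-Q3_K_M-00001-of-00007.gguf --cache-type-k q5_0 --threads 16 --n-gpu-layers 20 --prompt '<|User|>What is 1+1?<|Assistant|>'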
