So fast, great job!

#1
by ubergarm - opened

I just confirmed that the Q8_0 runs well on the CPU-only backend, tested with both the latest mainline llama.cpp and ik_llama.cpp!

Thanks to you and the llama.cpp team, with PR17889 paving the way on this one!
