So fast, great job!
#1, opened by ubergarm
I just confirmed that the Q8_0 runs well on the CPU-only backend, tested with both the latest mainline llama.cpp and ik_llama.cpp!
Thanks to you and to the llama.cpp team, whose PR17889 paved the way on this one!