Heya!

#1 by Spacellary - opened

Is this GGUF version of Kunoichi-7B notably different from TheBloke's? Or did you just want to have your own files for whatever reason?

Just curious! 😊

Hi @Spacellary
It's just converted with the latest llama.cpp release from last week. I wouldn't say it's notably different, but I use the main branch of llama.cpp and I like my models to stay compatible with the latest release for serving.
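
For anyone following along, here's a minimal sketch of that conversion step, assuming a local llama.cpp checkout and an already-downloaded Hugging Face checkpoint. The paths are hypothetical, and the conversion script has been renamed across llama.cpp versions (`convert.py` in older releases, `convert_hf_to_gguf.py` in newer ones), so treat the names as illustrative:

```python
import subprocess

# Hypothetical local paths; adjust for your own setup.
HF_MODEL_DIR = "models/Kunoichi-7B"       # downloaded Hugging Face checkpoint
GGUF_OUT = "models/kunoichi-7b-f16.gguf"  # full-precision GGUF output

# Convert the HF checkpoint to a GGUF file with llama.cpp's
# conversion script, keeping weights at f16 before quantizing.
subprocess.run(
    ["python", "convert_hf_to_gguf.py", HF_MODEL_DIR,
     "--outfile", GGUF_OUT, "--outtype", "f16"],
    check=True,
)
```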

I see! Are there any improvements to be gained from recent additions to the master branch?

They do lots of optimizations and release rapidly; I'm not sure whether those are directly tied to the GGUF conversion or the quantization itself. But I know they improved 2-bit quantization quite a lot in the last few weeks, and I'm pretty happy with some of the 2-bit quants I made from some merges. Mostly, though, I do it to be sure the latest llama.cpp can serve these GGUF models when it's used for API serving.
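
Roughly, the 2-bit quantization and serving steps look like this. This is a sketch, not the exact commands anyone used here: the binary names differ between llama.cpp versions (`quantize` vs. `llama-quantize`, `server` vs. `llama-server`), and the file paths are hypothetical:

```python
import subprocess

GGUF_F16 = "models/kunoichi-7b-f16.gguf"   # output of the conversion step
GGUF_Q2 = "models/kunoichi-7b-Q2_K.gguf"   # 2-bit quantized output

# Quantize the f16 GGUF down to 2-bit. Q2_K is one of llama.cpp's
# 2-bit quantization types; the IQ2_* variants are among the more
# recent 2-bit improvements mentioned above.
subprocess.run(["./llama-quantize", GGUF_F16, GGUF_Q2, "Q2_K"], check=True)

# Serve the quantized model over HTTP with llama.cpp's built-in
# server, which exposes an OpenAI-compatible API for clients.
subprocess.run(["./llama-server", "-m", GGUF_Q2, "--port", "8080"], check=True)
```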

Spacellary changed discussion status to closed
