Quantized version please
Thanks! I'm making one now; it'll be uploaded in a few minutes, along with a chat Space to try it out.
That's great. I'll be waiting for both the files and the chat Space.
Sorry, it failed. It'll be uploaded in 5-6 hours.
Meanwhile, you can try my humble attempt, sudhir2016/NeuralBeagle14-7B-GGUF, until the original work from the master himself is available!
Thanks @sudhir2016! The Space is now available here: https://huggingface.co/spaces/mlabonne/NeuralBeagle14-7B-GGUF-Chat (GGUF: https://huggingface.co/mlabonne/NeuralBeagle14-7B-GGUF).
@mlabonne Q4_K_M or Q5_K_M for 7B models? Is there any significant difference? I see that the Space was running the Q5 model earlier, but you switched to Q4.
Q5_K_M is slightly better in quality, but I switched to Q4_K_M because inference was too slow on a CPU.
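For anyone who wants to compare the two quants locally, here's a minimal sketch using llama-cpp-python (not my exact setup; the GGUF filename is an assumption based on the repo's usual naming convention, so check the repo's file list):

```python
# Minimal sketch: download one quant from the GGUF repo and run a
# CPU chat completion with llama-cpp-python.
# Requires: pip install llama-cpp-python huggingface_hub
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Q4_K_M: smaller and faster on CPU; Q5_K_M: slightly higher quality, slower.
model_path = hf_hub_download(
    repo_id="mlabonne/NeuralBeagle14-7B-GGUF",
    filename="neuralbeagle14-7b.Q4_K_M.gguf",  # assumed filename; verify in the repo
)

llm = Llama(
    model_path=model_path,
    n_ctx=2048,    # context window
    n_threads=8,   # tune to your CPU core count
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain GGUF quantization in one sentence."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```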
Thank you @mlabonne for the GGUF version, I really like it! My request has been fulfilled, but I'm leaving the discussion open because of the questions other users have asked.
Thanks @HR1777, sure, let's keep it open.