Converted GGUF-format model is very slow at inference (is that expected?)

by bangbang - opened Dec 29, 2023

Dec 29, 2023

I converted KoLLaVA-Synatra-7b to GGUF format, but the GGUF model is so slow that I thought I couldn't use it. (It is slow to the point of being unusable.)

Can you tell me whether this model is really supposed to be this slow?

Mineru

Apr 12

How did you quantize it? For example, Q8_0 or Q4_K_M?
