Broken

#1
by Aryanne - opened

open_llama_3b-q4_0-ggjt.bin seems to be broken: it doesn't run on koboldcpp and gives an error about the dimensionality of one of the weights.

rustformers org

koboldcpp is based on llama.cpp, which has hardcoded sizes for the different LLaMA architectures, meaning 3B isn't in there yet. You could contribute 3B support to koboldcpp, or use rustformers/llm, which calculates model sizes dynamically.
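
Roughly, the difference looks like this. This is only a toy Rust sketch, not the actual code in llama.cpp, koboldcpp, or rustformers/llm; OpenLLaMA 3B's 26 layers and 3200-dim embedding are used here purely for illustration:

```rust
/// Hyperparameters as stored in a GGML/GGJT file header (simplified).
#[derive(Debug)]
struct Hyperparameters {
    n_embd: usize,
    n_layer: usize,
}

/// llama.cpp-style loading (illustrative): map the layer count to a
/// hardcoded table of known LLaMA sizes. Anything outside the table is
/// rejected, which is why a 3B model (26 layers) fails to load.
fn lookup_hardcoded(n_layer: usize) -> Result<&'static str, String> {
    match n_layer {
        32 => Ok("7B"),
        40 => Ok("13B"),
        60 => Ok("30B"),
        80 => Ok("65B"),
        other => Err(format!("unsupported model: {other} layers not in table")),
    }
}

/// rustformers/llm-style loading (illustrative): trust the hyperparameters
/// in the file header and derive tensor shapes from them, so no size table
/// is needed.
fn load_dynamic(header: Hyperparameters) -> Hyperparameters {
    header
}

fn main() {
    // OpenLLaMA 3B: 26 transformer layers, 3200-dim embeddings.
    let open_llama_3b = Hyperparameters { n_embd: 3200, n_layer: 26 };

    println!("{:?}", lookup_hardcoded(open_llama_3b.n_layer)); // Err(...)
    println!("{:?}", load_dynamic(open_llama_3b));             // loads fine
}
```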

ok, I was running https://huggingface.co/SlyEcho/open_llama_3b_ggml and it was working fine
