
# MKLLM-7B-Instruct-GGUF

GGUF quantizations of [trajkovnikola/MKLLM-7B-Instruct](https://huggingface.co/trajkovnikola/MKLLM-7B-Instruct).

## Script used

```bash
from_dir=./MKLLM-7B-Instruct
dir=./MKLLM-7B-Instruct-GGUF
base_precision=BF16
file_base=MKLLM-7B-Instruct

quants=("Q2_K" "Q3_K_S" "Q3_K_M" "Q3_K_L" "Q4_K_S" "Q4_K_M" "Q4_0" "Q4_1" "Q5_K_S" "Q5_K_M" "Q5_0" "Q5_1" "Q6_K" "Q8_0" "IQ3_XS" "IQ3_S" "IQ3_M" "IQ4_XS" "IQ4_NL")

# Convert the Hugging Face checkpoint to a BF16 GGUF base file
docker run --rm -v "${from_dir}":/repo ghcr.io/ggerganov/llama.cpp:full --convert "/repo" --outtype bf16

mkdir -p "${dir}"
mv "${from_dir}/ggml-model-bf16.gguf" "${dir}/${file_base}-${base_precision}.gguf"

# Quantize the BF16 base into each target format
for quant in "${quants[@]}"; do
    echo "###########################"
    echo "${quant}"
    echo "==========================="
    docker run --rm -v "${dir}":/repo ghcr.io/ggerganov/llama.cpp:full --quantize "/repo/${file_base}-${base_precision}.gguf" "/repo/${file_base}-${quant}.gguf" "${quant}"
done
```
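Any of the resulting quants can be sampled with the same Docker image's `--run` mode. A minimal sketch, assuming the `Q4_K_M` file produced above (swap in whichever quant and prompt you want, and adjust sampling flags to taste):

```shell
# Run a quick generation against one of the quantized files
# (Q4_K_M is an example choice; any quant from the list above works)
docker run --rm -v "./MKLLM-7B-Instruct-GGUF":/repo ghcr.io/ggerganov/llama.cpp:full \
    --run -m "/repo/MKLLM-7B-Instruct-Q4_K_M.gguf" \
    -p "Здраво! Како си?" -n 128
```

Smaller quants (e.g. `Q2_K`, `IQ3_XS`) trade output quality for memory and speed; `Q4_K_M` and up are the usual starting points.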
## Details

- Format: GGUF
- Model size: 7.24B params
- Architecture: llama
- Available precisions: 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit, 16-bit
