ktoprakucar/granite-guardian-3.1-2b-Q8-GGUF


ktoprakucar committed 16 days ago

Commit eec63b3 · verified · 1 parent: 9a3b64e

Update README.md

Files changed (1)

README.md +0 -2

README.md CHANGED

@@ -15,8 +15,6 @@ A quantized version of [Granite Guardian 3.1 2B](https://huggingface.co/ibm-gran
 Quantization is done by [llama.cpp](https://github.com/ggerganov/llama.cpp).
-P.S. The llama.cpp library encountered issues during model initialization in both Python and llama-server modes, even with the quantized 8B version from other distributors. However, you can use [LM Studio](https://lmstudio.ai/) for inference!
 ## Model Summary (from original repository)

