Update README.md
README.md
CHANGED
@@ -15,6 +15,9 @@ A quantized version of [Granite Guardian 3.1 2B](https://huggingface.co/ibm-gran
 
 Quantization is done by [llama.cpp](https://github.com/ggerganov/llama.cpp).
 
+P.S. The llama.cpp library encountered issues during model initialization in both Python and llama-server modes, even with the quantized 8B version from other distributors. However, you can use [LM Studio](https://lmstudio.ai/) for inference!
+
+
 ## Model Summary (from original repository)
 
 **Granite Guardian 3.1 2B** is a fine-tuned Granite 3.1 2B Instruct model designed to detect risks in prompts and responses.