bartowski/granite-3.1-3b-a800m-instruct-GGUF

Text Generation · GGUF · language · granite-3.1 · conversational


bartowski committed 16 days ago

Commit d23fb5d • 1 Parent(s): 518373c

Upload README.md with huggingface_hub

Files changed (1)

README.md +6 -7

@@ -1,17 +1,11 @@
 ---
 quantized_by: bartowski
 pipeline_tag: text-generation
-tags:
-- language
-- granite-3.1
-license: apache-2.0
-inference: false
-base_model: ibm-granite/granite-3.1-3b-a800m-instruct
 ---
 ## Llamacpp imatrix Quantizations of granite-3.1-3b-a800m-instruct
-Using <a href="https://github.com/ggerganov/llama.cpp/">llama.cpp</a> release <a href="https://github.com/ggerganov/llama.cpp/releases/tag/b4341">b4341</a> for quantization.
 Original model: https://huggingface.co/ibm-granite/granite-3.1-3b-a800m-instruct
@@ -27,10 +21,15 @@ Run them in [LM Studio](https://lmstudio.ai/)
 <|start_of_role|>assistant<|end_of_role|>
 ```
 ## Download a file (not the whole branch) from below:
 | Filename | Quant type | File Size | Split | Description |
 | -------- | ---------- | --------- | ----- | ----------- |
 | [granite-3.1-3b-a800m-instruct-Q8_0.gguf](https://huggingface.co/bartowski/granite-3.1-3b-a800m-instruct-GGUF/blob/main/granite-3.1-3b-a800m-instruct-Q8_0.gguf) | Q8_0 | 3.51GB | false | Extremely high quality, generally unneeded but max available quant. |
 | [granite-3.1-3b-a800m-instruct-Q6_K_L.gguf](https://huggingface.co/bartowski/granite-3.1-3b-a800m-instruct-GGUF/blob/main/granite-3.1-3b-a800m-instruct-Q6_K_L.gguf) | Q6_K_L | 2.73GB | false | Uses Q8_0 for embed and output weights. Very high quality, near perfect, *recommended*. |
 | [granite-3.1-3b-a800m-instruct-Q6_K.gguf](https://huggingface.co/bartowski/granite-3.1-3b-a800m-instruct-GGUF/blob/main/granite-3.1-3b-a800m-instruct-Q6_K.gguf) | Q6_K | 2.71GB | false | Very high quality, near perfect, *recommended*. |

 ---
 quantized_by: bartowski
 pipeline_tag: text-generation
 ---
 ## Llamacpp imatrix Quantizations of granite-3.1-3b-a800m-instruct
+Using <a href="https://github.com/ggerganov/llama.cpp/">llama.cpp</a> release <a href="https://github.com/ggerganov/llama.cpp/releases/tag/b4369">b4369</a> for quantization.
 Original model: https://huggingface.co/ibm-granite/granite-3.1-3b-a800m-instruct
 <|start_of_role|>assistant<|end_of_role|>
 ```
+## What's new:
+Fix tokenizer
 ## Download a file (not the whole branch) from below:
 | Filename | Quant type | File Size | Split | Description |
 | -------- | ---------- | --------- | ----- | ----------- |
+| [granite-3.1-3b-a800m-instruct-f16.gguf](https://huggingface.co/bartowski/granite-3.1-3b-a800m-instruct-GGUF/blob/main/granite-3.1-3b-a800m-instruct-f16.gguf) | f16 | 6.60GB | false | Full F16 weights. |
 | [granite-3.1-3b-a800m-instruct-Q8_0.gguf](https://huggingface.co/bartowski/granite-3.1-3b-a800m-instruct-GGUF/blob/main/granite-3.1-3b-a800m-instruct-Q8_0.gguf) | Q8_0 | 3.51GB | false | Extremely high quality, generally unneeded but max available quant. |
 | [granite-3.1-3b-a800m-instruct-Q6_K_L.gguf](https://huggingface.co/bartowski/granite-3.1-3b-a800m-instruct-GGUF/blob/main/granite-3.1-3b-a800m-instruct-Q6_K_L.gguf) | Q6_K_L | 2.73GB | false | Uses Q8_0 for embed and output weights. Very high quality, near perfect, *recommended*. |
 | [granite-3.1-3b-a800m-instruct-Q6_K.gguf](https://huggingface.co/bartowski/granite-3.1-3b-a800m-instruct-GGUF/blob/main/granite-3.1-3b-a800m-instruct-Q6_K.gguf) | Q6_K | 2.71GB | false | Very high quality, near perfect, *recommended*. |
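
The README asks readers to download a single file rather than the whole branch. The sketch below is one possible way to do that from Python; it is not part of the commit. It assumes the filenames follow the `granite-3.1-3b-a800m-instruct-<quant>.gguf` pattern visible in the table, and that the `huggingface_hub` package is installed if `download_quant` is called. The helper names (`quant_filename`, `blob_to_resolve`, `download_quant`) are hypothetical and exist only for illustration.

```python
from urllib.parse import urlparse

REPO_ID = "bartowski/granite-3.1-3b-a800m-instruct-GGUF"
MODEL_PREFIX = "granite-3.1-3b-a800m-instruct"


def quant_filename(quant: str) -> str:
    """Build a filename from a quant type, following the table's naming pattern."""
    return f"{MODEL_PREFIX}-{quant}.gguf"


def blob_to_resolve(url: str) -> str:
    """Turn a Hub 'blob' page link (as used in the table) into a direct 'resolve' link."""
    parts = urlparse(url)
    segments = parts.path.split("/")
    # Expected path shape: /<user>/<repo>/blob/<revision>/<filename>
    if len(segments) < 6 or segments[3] != "blob":
        raise ValueError(f"not a blob URL: {url}")
    segments[3] = "resolve"
    return parts._replace(path="/".join(segments)).geturl()


def download_quant(quant: str, local_dir: str = ".") -> str:
    """Download one quant file only; assumes huggingface_hub is installed."""
    # Imported lazily so the helpers above work without the package.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(
        repo_id=REPO_ID,
        filename=quant_filename(quant),
        local_dir=local_dir,
    )
```

Calling `download_quant("Q6_K_L")` would fetch only that one file into the current directory, leaving the other quants on the Hub.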