bartowski committed on
Commit d23fb5d
1 Parent(s): 518373c

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +6 -7
README.md CHANGED
@@ -1,17 +1,11 @@
  ---
  quantized_by: bartowski
  pipeline_tag: text-generation
- tags:
- - language
- - granite-3.1
- license: apache-2.0
- inference: false
- base_model: ibm-granite/granite-3.1-3b-a800m-instruct
  ---
 
  ## Llamacpp imatrix Quantizations of granite-3.1-3b-a800m-instruct
 
- Using <a href="https://github.com/ggerganov/llama.cpp/">llama.cpp</a> release <a href="https://github.com/ggerganov/llama.cpp/releases/tag/b4341">b4341</a> for quantization.
+ Using <a href="https://github.com/ggerganov/llama.cpp/">llama.cpp</a> release <a href="https://github.com/ggerganov/llama.cpp/releases/tag/b4369">b4369</a> for quantization.
 
  Original model: https://huggingface.co/ibm-granite/granite-3.1-3b-a800m-instruct
 
@@ -27,10 +21,15 @@ Run them in [LM Studio](https://lmstudio.ai/)
  <|start_of_role|>assistant<|end_of_role|>
  ```
 
+ ## What's new:
+
+ Fix tokenizer
+
  ## Download a file (not the whole branch) from below:
 
  | Filename | Quant type | File Size | Split | Description |
  | -------- | ---------- | --------- | ----- | ----------- |
+ | [granite-3.1-3b-a800m-instruct-f16.gguf](https://huggingface.co/bartowski/granite-3.1-3b-a800m-instruct-GGUF/blob/main/granite-3.1-3b-a800m-instruct-f16.gguf) | f16 | 6.60GB | false | Full F16 weights. |
  | [granite-3.1-3b-a800m-instruct-Q8_0.gguf](https://huggingface.co/bartowski/granite-3.1-3b-a800m-instruct-GGUF/blob/main/granite-3.1-3b-a800m-instruct-Q8_0.gguf) | Q8_0 | 3.51GB | false | Extremely high quality, generally unneeded but max available quant. |
  | [granite-3.1-3b-a800m-instruct-Q6_K_L.gguf](https://huggingface.co/bartowski/granite-3.1-3b-a800m-instruct-GGUF/blob/main/granite-3.1-3b-a800m-instruct-Q6_K_L.gguf) | Q6_K_L | 2.73GB | false | Uses Q8_0 for embed and output weights. Very high quality, near perfect, *recommended*. |
  | [granite-3.1-3b-a800m-instruct-Q6_K.gguf](https://huggingface.co/bartowski/granite-3.1-3b-a800m-instruct-GGUF/blob/main/granite-3.1-3b-a800m-instruct-Q6_K.gguf) | Q6_K | 2.71GB | false | Very high quality, near perfect, *recommended*. |