AIronMind committed on
Commit 462ebbf
1 Parent(s): de3a869

Upload README.md with huggingface_hub

Files changed (1): README.md (+153 -0)
README.md ADDED
---
pipeline_tag: text-generation
inference: false
license: apache-2.0
library_name: transformers
tags:
- language
- granite-3.0
- llama-cpp
- gguf-my-repo
new_version: ibm-granite/granite-3.1-1b-a400m-base
base_model: ibm-granite/granite-3.0-1b-a400m-base
model-index:
- name: granite-3.0-1b-a400m-base
  results:
  - task:
      type: text-generation
    dataset:
      name: MMLU
      type: human-exams
    metrics:
    - type: pass@1
      value: 25.69
      name: pass@1
    - type: pass@1
      value: 11.38
      name: pass@1
    - type: pass@1
      value: 19.96
      name: pass@1
  - task:
      type: text-generation
    dataset:
      name: WinoGrande
      type: commonsense
    metrics:
    - type: pass@1
      value: 62.43
      name: pass@1
    - type: pass@1
      value: 39
      name: pass@1
    - type: pass@1
      value: 35.76
      name: pass@1
    - type: pass@1
      value: 75.35
      name: pass@1
    - type: pass@1
      value: 64.92
      name: pass@1
    - type: pass@1
      value: 39.49
      name: pass@1
  - task:
      type: text-generation
    dataset:
      name: BoolQ
      type: reading-comprehension
    metrics:
    - type: pass@1
      value: 65.44
      name: pass@1
    - type: pass@1
      value: 17.78
      name: pass@1
  - task:
      type: text-generation
    dataset:
      name: ARC-C
      type: reasoning
    metrics:
    - type: pass@1
      value: 38.14
      name: pass@1
    - type: pass@1
      value: 24.41
      name: pass@1
    - type: pass@1
      value: 29.84
      name: pass@1
    - type: pass@1
      value: 33.99
      name: pass@1
  - task:
      type: text-generation
    dataset:
      name: HumanEval
      type: code
    metrics:
    - type: pass@1
      value: 21.95
      name: pass@1
    - type: pass@1
      value: 23.2
      name: pass@1
  - task:
      type: text-generation
    dataset:
      name: GSM8K
      type: math
    metrics:
    - type: pass@1
      value: 19.26
      name: pass@1
    - type: pass@1
      value: 8.96
      name: pass@1
---

# AIronMind/granite-3.0-1b-a400m-base-Q4_K_M-GGUF
This model was converted to GGUF format from [`ibm-granite/granite-3.0-1b-a400m-base`](https://huggingface.co/ibm-granite/granite-3.0-1b-a400m-base) using llama.cpp, via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
Refer to the [original model card](https://huggingface.co/ibm-granite/granite-3.0-1b-a400m-base) for more details on the model.

## Use with llama.cpp
Install llama.cpp through brew (works on macOS and Linux):

```bash
brew install llama.cpp
```
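To sanity-check the install, you can print the build info (`--version` is a standard flag across the llama.cpp binaries):

```bash
llama-cli --version
```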
Invoke the llama.cpp server or the CLI.

### CLI:
```bash
llama-cli --hf-repo AIronMind/granite-3.0-1b-a400m-base-Q4_K_M-GGUF --hf-file granite-3.0-1b-a400m-base-q4_k_m.gguf -p "The meaning to life and the universe is"
```

### Server:
```bash
llama-server --hf-repo AIronMind/granite-3.0-1b-a400m-base-Q4_K_M-GGUF --hf-file granite-3.0-1b-a400m-base-q4_k_m.gguf -c 2048
```
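Once the server is up, you can query it over HTTP. A minimal sketch using llama-server's native `/completion` endpoint, assuming the default bind address of `localhost:8080` (adjust if you passed `--host`/`--port`):

```bash
# Ask the running server for a 64-token completion of the same prompt
curl http://localhost:8080/completion \
  -H "Content-Type: application/json" \
  -d '{"prompt": "The meaning to life and the universe is", "n_predict": 64}'
```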

Note: You can also use this checkpoint directly through the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the llama.cpp repo.

Step 1: Clone llama.cpp from GitHub.
```bash
git clone https://github.com/ggerganov/llama.cpp
```

Step 2: Move into the llama.cpp folder and build it with the `LLAMA_CURL=1` flag, along with any hardware-specific flags (e.g. `LLAMA_CUDA=1` for Nvidia GPUs on Linux).
```bash
cd llama.cpp && LLAMA_CURL=1 make
```
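Recent llama.cpp releases have deprecated the Makefile in favor of CMake, so if `make` fails on a current checkout, a roughly equivalent CMake build is sketched below (flag names as of newer releases; the CUDA toggle became `GGML_CUDA=ON`):

```bash
# Configure with remote-model support, then build in release mode
cmake -B build -DLLAMA_CURL=ON
cmake --build build --config Release
```

With a CMake build the binaries land in `build/bin/` rather than the repo root, so adjust the paths in Step 3 accordingly.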

Step 3: Run inference through the main binary.
```bash
./llama-cli --hf-repo AIronMind/granite-3.0-1b-a400m-base-Q4_K_M-GGUF --hf-file granite-3.0-1b-a400m-base-q4_k_m.gguf -p "The meaning to life and the universe is"
```
or
```bash
./llama-server --hf-repo AIronMind/granite-3.0-1b-a400m-base-Q4_K_M-GGUF --hf-file granite-3.0-1b-a400m-base-q4_k_m.gguf -c 2048
```
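
If you prefer to fetch the quantized file once and run it offline, a sketch using the Hugging Face CLI (assuming the `huggingface_hub` package is installed; the repo and file names are the ones used above):

```bash
# Download the Q4_K_M quant into the current directory
huggingface-cli download AIronMind/granite-3.0-1b-a400m-base-Q4_K_M-GGUF \
  granite-3.0-1b-a400m-base-q4_k_m.gguf --local-dir .

# Point llama-cli at the local file instead of --hf-repo/--hf-file
./llama-cli -m granite-3.0-1b-a400m-base-q4_k_m.gguf -p "The meaning to life and the universe is"
```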