bartowski committed on
Commit
48cd030
1 Parent(s): 97d2266

Llamacpp quants

.gitattributes CHANGED
@@ -33,3 +33,8 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
33
  *.zip filter=lfs diff=lfs merge=lfs -text
34
  *.zst filter=lfs diff=lfs merge=lfs -text
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
36
+ c4ai-command-r-08-2024-Q3_K_L.gguf filter=lfs diff=lfs merge=lfs -text
37
+ c4ai-command-r-08-2024-Q4_0_4_4.gguf filter=lfs diff=lfs merge=lfs -text
38
+ c4ai-command-r-08-2024-Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
39
+ c4ai-command-r-08-2024-Q6_K.gguf filter=lfs diff=lfs merge=lfs -text
40
+ c4ai-command-r-08-2024-Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,326 @@
1
+ ---
2
+ language:
3
+ - en
4
+ - fr
5
+ - de
6
+ - es
7
+ - it
8
+ - pt
9
+ - ja
10
+ - ko
11
+ - zh
12
+ - ar
13
+ license: cc-by-nc-4.0
14
+ library_name: transformers
15
+ extra_gated_prompt: "By submitting this form, you agree to the [License Agreement](https://cohere.com/c4ai-cc-by-nc-license) and acknowledge that the information you provide will be collected, used, and shared in accordance with Cohere’s [Privacy Policy](https://cohere.com/privacy)."
16
+ extra_gated_fields:
17
+ Name: text
18
+ Affiliation: text
19
+ Country:
20
+ type: select
21
+ options:
22
+ - Aruba
23
+ - Afghanistan
24
+ - Angola
25
+ - Anguilla
26
+ - Åland Islands
27
+ - Albania
28
+ - Andorra
29
+ - United Arab Emirates
30
+ - Argentina
31
+ - Armenia
32
+ - American Samoa
33
+ - Antarctica
34
+ - French Southern Territories
35
+ - Antigua and Barbuda
36
+ - Australia
37
+ - Austria
38
+ - Azerbaijan
39
+ - Burundi
40
+ - Belgium
41
+ - Benin
42
+ - Bonaire Sint Eustatius and Saba
43
+ - Burkina Faso
44
+ - Bangladesh
45
+ - Bulgaria
46
+ - Bahrain
47
+ - Bahamas
48
+ - Bosnia and Herzegovina
49
+ - Saint Barthélemy
50
+ - Belarus
51
+ - Belize
52
+ - Bermuda
53
+ - Plurinational State of Bolivia
54
+ - Brazil
55
+ - Barbados
56
+ - Brunei-Darussalam
57
+ - Bhutan
58
+ - Bouvet-Island
59
+ - Botswana
60
+ - Central African Republic
61
+ - Canada
62
+ - Cocos (Keeling) Islands
63
+ - Switzerland
64
+ - Chile
65
+ - China
66
+ - Côte-dIvoire
67
+ - Cameroon
68
+ - Democratic Republic of the Congo
69
+ - Cook Islands
70
+ - Colombia
71
+ - Comoros
72
+ - Cabo Verde
73
+ - Costa Rica
74
+ - Cuba
75
+ - Curaçao
76
+ - Christmas Island
77
+ - Cayman Islands
78
+ - Cyprus
79
+ - Czechia
80
+ - Germany
81
+ - Djibouti
82
+ - Dominica
83
+ - Denmark
84
+ - Dominican Republic
85
+ - Algeria
86
+ - Ecuador
87
+ - Egypt
88
+ - Eritrea
89
+ - Western Sahara
90
+ - Spain
91
+ - Estonia
92
+ - Ethiopia
93
+ - Finland
94
+ - Fiji
95
+ - Falkland Islands (Malvinas)
96
+ - France
97
+ - Faroe Islands
98
+ - Federated States of Micronesia
99
+ - Gabon
100
+ - United Kingdom
101
+ - Georgia
102
+ - Guernsey
103
+ - Ghana
104
+ - Gibraltar
105
+ - Guinea
106
+ - Guadeloupe
107
+ - Gambia
108
+ - Guinea Bissau
109
+ - Equatorial Guinea
110
+ - Greece
111
+ - Grenada
112
+ - Greenland
113
+ - Guatemala
114
+ - French Guiana
115
+ - Guam
116
+ - Guyana
117
+ - Hong Kong
118
+ - Heard Island and McDonald Islands
119
+ - Honduras
120
+ - Croatia
121
+ - Haiti
122
+ - Hungary
123
+ - Indonesia
124
+ - Isle of Man
125
+ - India
126
+ - British Indian Ocean Territory
127
+ - Ireland
128
+ - Islamic Republic of Iran
129
+ - Iraq
130
+ - Iceland
131
+ - Israel
132
+ - Italy
133
+ - Jamaica
134
+ - Jersey
135
+ - Jordan
136
+ - Japan
137
+ - Kazakhstan
138
+ - Kenya
139
+ - Kyrgyzstan
140
+ - Cambodia
141
+ - Kiribati
142
+ - Saint-Kitts-and-Nevis
143
+ - South Korea
144
+ - Kuwait
145
+ - Lao-Peoples-Democratic-Republic
146
+ - Lebanon
147
+ - Liberia
148
+ - Libya
149
+ - Saint-Lucia
150
+ - Liechtenstein
151
+ - Sri Lanka
152
+ - Lesotho
153
+ - Lithuania
154
+ - Luxembourg
155
+ - Latvia
156
+ - Macao
157
+ - Saint Martin (French-part)
158
+ - Morocco
159
+ - Monaco
160
+ - Republic of Moldova
161
+ - Madagascar
162
+ - Maldives
163
+ - Mexico
164
+ - Marshall Islands
165
+ - North Macedonia
166
+ - Mali
167
+ - Malta
168
+ - Myanmar
169
+ - Montenegro
170
+ - Mongolia
171
+ - Northern Mariana Islands
172
+ - Mozambique
173
+ - Mauritania
174
+ - Montserrat
175
+ - Martinique
176
+ - Mauritius
177
+ - Malawi
178
+ - Malaysia
179
+ - Mayotte
180
+ - Namibia
181
+ - New Caledonia
182
+ - Niger
183
+ - Norfolk Island
184
+ - Nigeria
185
+ - Nicaragua
186
+ - Niue
187
+ - Netherlands
188
+ - Norway
189
+ - Nepal
190
+ - Nauru
191
+ - New Zealand
192
+ - Oman
193
+ - Pakistan
194
+ - Panama
195
+ - Pitcairn
196
+ - Peru
197
+ - Philippines
198
+ - Palau
199
+ - Papua New Guinea
200
+ - Poland
201
+ - Puerto Rico
202
+ - North Korea
203
+ - Portugal
204
+ - Paraguay
205
+ - State of Palestine
206
+ - French Polynesia
207
+ - Qatar
208
+ - Réunion
209
+ - Romania
210
+ - Russia
211
+ - Rwanda
212
+ - Saudi Arabia
213
+ - Sudan
214
+ - Senegal
215
+ - Singapore
216
+ - South Georgia and the South Sandwich Islands
217
+ - Saint Helena Ascension and Tristan da Cunha
218
+ - Svalbard and Jan Mayen
219
+ - Solomon Islands
220
+ - Sierra Leone
221
+ - El Salvador
222
+ - San Marino
223
+ - Somalia
224
+ - Saint Pierre and Miquelon
225
+ - Serbia
226
+ - South Sudan
227
+ - Sao Tome and Principe
228
+ - Suriname
229
+ - Slovakia
230
+ - Slovenia
231
+ - Sweden
232
+ - Eswatini
233
+ - Sint Maarten (Dutch-part)
234
+ - Seychelles
235
+ - Syrian Arab Republic
236
+ - Turks and Caicos Islands
237
+ - Chad
238
+ - Togo
239
+ - Thailand
240
+ - Tajikistan
241
+ - Tokelau
242
+ - Turkmenistan
243
+ - Timor Leste
244
+ - Tonga
245
+ - Trinidad and Tobago
246
+ - Tunisia
247
+ - Turkey
248
+ - Tuvalu
249
+ - Taiwan
250
+ - United Republic of Tanzania
251
+ - Uganda
252
+ - Ukraine
253
+ - United States Minor Outlying Islands
254
+ - Uruguay
255
+ - United-States
256
+ - Uzbekistan
257
+ - Holy See (Vatican City State)
258
+ - Saint Vincent and the Grenadines
259
+ - Bolivarian Republic of Venezuela
260
+ - Virgin Islands British
261
+ - Virgin Islands U.S.
262
+ - VietNam
263
+ - Vanuatu
264
+ - Wallis and Futuna
265
+ - Samoa
266
+ - Yemen
267
+ - South Africa
268
+ - Zambia
269
+ - Zimbabwe
270
+ Receive email updates on C4AI and Cohere research, events, products and services?:
271
+ type: select
272
+ options:
273
+ - Yes
274
+ - No
275
+ I agree to use this model for non-commercial use ONLY: checkbox
276
+ quantized_by: bartowski
277
+ pipeline_tag: text-generation
278
+ ---
279
+
280
+ ## Llamacpp Static (no imatrix) Quantizations of c4ai-command-r-08-2024
281
+
282
+ Using <a href="https://github.com/ggerganov/llama.cpp/">llama.cpp</a> release <a href="https://github.com/ggerganov/llama.cpp/releases/tag/b3634">b3634</a> for quantization.
283
+
284
+ Original model: https://huggingface.co/CohereForAI/c4ai-command-r-08-2024
285
+
286
+ ## Prompt format
287
+
288
+ No prompt format
289
+
290
+ ## Download a file (not the whole branch) from below:
291
+
292
+ | Filename | Quant type | File Size | Description |
293
+ | -------- | ---------- | --------- | ----------- |
294
+ | [c4ai-command-r-08-2024-Q8_0.gguf](https://huggingface.co/bartowski/c4ai-command-r-08-2024-GGUF/blob/main/c4ai-command-r-08-2024-Q8_0.gguf) | Q8_0 | 34.32GB | Extremely high quality, generally unneeded but max available quant. |
295
+ | [c4ai-command-r-08-2024-Q6_K.gguf](https://huggingface.co/bartowski/c4ai-command-r-08-2024-GGUF/blob/main/c4ai-command-r-08-2024-Q6_K.gguf) | Q6_K | 26.50GB | Very high quality, near perfect, *recommended*. |
296
+ | [c4ai-command-r-08-2024-Q5_K_M.gguf](https://huggingface.co/bartowski/c4ai-command-r-08-2024-GGUF//main/c4ai-command-r-08-2024-Q5_K_M.gguf) | Q5_K_M | | High quality, *recommended*. |
297
+ | [c4ai-command-r-08-2024-Q4_K_M.gguf](https://huggingface.co/bartowski/c4ai-command-r-08-2024-GGUF/blob/main/c4ai-command-r-08-2024-Q4_K_M.gguf) | Q4_K_M | 19.80GB | Good quality, uses about 4.83 bits per weight, *recommended*. |
298
+ | [c4ai-command-r-08-2024-IQ4_NL.gguf](https://huggingface.co/bartowski/c4ai-command-r-08-2024-GGUF//main/c4ai-command-r-08-2024-IQ4_NL.gguf) | IQ4_NL | | Decent quality, slightly smaller than Q4_K_S with similar performance, *recommended*. |
299
+ | [c4ai-command-r-08-2024-Q3_K_L.gguf](https://huggingface.co/bartowski/c4ai-command-r-08-2024-GGUF/blob/main/c4ai-command-r-08-2024-Q3_K_L.gguf) | Q3_K_L | 17.56GB | Lower quality but usable, good for low RAM availability. |
300
+ | [c4ai-command-r-08-2024-Q2_K.gguf](https://huggingface.co/bartowski/c4ai-command-r-08-2024-GGUF//main/c4ai-command-r-08-2024-Q2_K.gguf) | Q2_K | | Very low quality but surprisingly usable. |
301
+
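To fetch a single file rather than the whole repository, one option is the Hub's direct-download URL. The sketch below is an illustration rather than part of the original instructions: it assumes the `resolve` URL pattern used by the Hugging Face Hub (the table links use `blob`, which opens the file page instead) and takes the repository name from the table links. `gguf_download_url` is a hypothetical helper name.

```python
from urllib.parse import quote

# Repository name as it appears in the table links above.
REPO_ID = "bartowski/c4ai-command-r-08-2024-GGUF"


def gguf_download_url(filename: str, repo_id: str = REPO_ID, revision: str = "main") -> str:
    """Build a direct-download URL for one file (hypothetical helper).

    Assumes the Hub serves raw files under /resolve/<revision>/<filename>.
    """
    return f"https://huggingface.co/{repo_id}/resolve/{quote(revision)}/{quote(filename)}"


if __name__ == "__main__":
    print(gguf_download_url("c4ai-command-r-08-2024-Q4_K_M.gguf"))
```

The resulting URL can be handed to any HTTP client. If the `huggingface_hub` package is installed, its `hf_hub_download` function is an alternative that also handles caching.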
302
+ ## Which file should I choose?
303
+
304
+ A great write-up with charts comparing the performance of the various quants is provided by Artefact2 [here](https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9).
305
+
306
+ The first thing to figure out is how big a model you can run. To do this, you'll need to figure out how much RAM and/or VRAM you have.
307
+
308
+ If you want your model running as FAST as possible, you'll want to fit the whole thing on your GPU's VRAM. Aim for a quant with a file size 1-2GB smaller than your GPU's total VRAM.
309
+
310
+ If you want the absolute maximum quality, add both your system RAM and your GPU's VRAM together, then similarly grab a quant with a file size 1-2GB smaller than that total.
311
+
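The sizing rule above can be sketched in a few lines. This is a minimal illustration, assuming the file sizes listed in the table (rows without a listed size are left out) and a fixed headroom of 2GB, the upper end of the 1-2GB range. `pick_quant` and `QUANT_SIZES_GB` are hypothetical names.

```python
# File sizes in GB, copied from the table; rows without a listed size are omitted.
QUANT_SIZES_GB = {
    "Q8_0": 34.32,
    "Q6_K": 26.50,
    "Q4_K_M": 19.80,
    "Q3_K_L": 17.56,
}


def pick_quant(vram_gb, ram_gb=0.0, headroom_gb=2.0):
    """Return the largest listed quant that fits the memory budget, or None.

    Pass only vram_gb for the "as fast as possible" case; add ram_gb for the
    "maximum quality" case where the model is split across GPU and system RAM.
    """
    budget = vram_gb + ram_gb - headroom_gb
    fitting = {name: size for name, size in QUANT_SIZES_GB.items() if size <= budget}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)


if __name__ == "__main__":
    print(pick_quant(24.0))        # GPU only
    print(pick_quant(24.0, 32.0))  # GPU plus system RAM
```

The headroom is a parameter because the right margin depends on context length and whatever else is using the GPU.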
312
+ Next, you'll need to decide if you want to use an 'I-quant' or a 'K-quant'.
313
+
314
+ If you don't want to think too much, grab one of the K-quants. These use the format 'QX_K_X', like Q5_K_M.
315
+
316
+ If you want to get more into the weeds, you can check out this extremely useful feature chart:
317
+
318
+ [llama.cpp feature matrix](https://github.com/ggerganov/llama.cpp/wiki/Feature-matrix)
319
+
320
+ But basically, if you're aiming for below Q4, and you're running cuBLAS (Nvidia) or rocBLAS (AMD), you should look towards the I-quants. These are in format IQX_X, like IQ3_M. These are newer and offer better performance for their size.
321
+
322
+ These I-quants can also be used on CPU and Apple Metal, but will be slower than their K-quant equivalents, so you'll have to decide on the tradeoff between speed and quality.
323
+
324
+ The I-quants are *not* compatible with Vulkan, which also runs on AMD cards, so if you have an AMD card, double-check whether you're using the rocBLAS build or the Vulkan build. At the time of writing this, LM Studio has a preview with ROCm support, and other inference engines have specific builds for ROCm.
325
+
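The rules of thumb above can be condensed into a small decision function. This is only a sketch of the guidance in this README, not a compatibility check against llama.cpp itself; `suggest_quant_family` and the backend strings are hypothetical names chosen for the example.

```python
def suggest_quant_family(backend: str, below_q4: bool) -> str:
    """Return "I-quant" or "K-quant" following the README's rules of thumb.

    backend:  "cublas", "rocblas", "vulkan", "metal" or "cpu" (case-insensitive)
    below_q4: True when aiming for a quant smaller than Q4
    """
    backend = backend.lower()
    if backend == "vulkan":
        # The README states I-quants are not compatible with the Vulkan build.
        return "K-quant"
    if below_q4 and backend in {"cublas", "rocblas"}:
        # Newer format, better quality for the size on these backends.
        return "I-quant"
    # Default choice; on CPU and Metal the I-quants work but run slower.
    return "K-quant"


if __name__ == "__main__":
    print(suggest_quant_family("cuBLAS", below_q4=True))
    print(suggest_quant_family("Vulkan", below_q4=True))
```

On CPU or Metal an I-quant remains a valid choice when size matters more than speed; the function simply returns the faster default.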
326
+ Want to support my work? Visit my ko-fi page here: https://ko-fi.com/bartowski
c4ai-command-r-08-2024-Q3_K_L.gguf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:c67b792c2958c522c88231bf1fd424a77b2ef364fe9df9045920533c4785d010
3
+ size 17563434144
c4ai-command-r-08-2024-Q4_0_4_4.gguf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:3279e8b0d42bd9ebcf5b4b2ea380b210021f14ed35536dedc265394173c0f4f4
3
+ size 18719489184
c4ai-command-r-08-2024-Q4_K_M.gguf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:3888013c7649fab42f35135fbff86f33e9f70613a41c8d1376a9c4290e38d8a2
3
+ size 19800833184
c4ai-command-r-08-2024-Q6_K.gguf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:74ea0f17ff7267be8e857c1fb98828df23eeda6ce15ead06a39e1b3e58b204ae
3
+ size 26505165984
c4ai-command-r-08-2024-Q8_0.gguf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:5f582cc76f845d6bc7412869e511fb1ff1a6815ee6638f4dad093b80f41d0d68
3
+ size 34326887584
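
The `.gguf` entries in this commit are Git LFS pointer files: three lines giving the spec version, a `sha256` object ID, and the size in bytes. As a hedged sketch, a downloaded file could be checked against its pointer as follows. The pointer text is copied from the Q3_K_L entry above; `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names.

```python
import hashlib

# Pointer contents copied from the Q3_K_L entry in this commit.
POINTER_Q3_K_L = """\
version https://git-lfs.github.com/spec/v1
oid sha256:c67b792c2958c522c88231bf1fd424a77b2ef364fe9df9045920533c4785d010
size 17563434144
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split a pointer file into version, hash algorithm, digest and byte size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Stream a local file through hashlib and compare size and digest."""
    hasher = hashlib.new(pointer["algo"])
    total = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
            total += len(chunk)
    return total == pointer["size"] and hasher.hexdigest() == pointer["digest"]


if __name__ == "__main__":
    pointer = parse_lfs_pointer(POINTER_Q3_K_L)
    # Size in decimal gigabytes, for comparison with the README table.
    print(pointer["size"] / 1e9)
```

Dividing the pointer size by 10^9 gives roughly 17.56, which suggests the README table reports sizes in decimal gigabytes.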