---
base_model: Qwen/Qwen2.5-3B
language:
- en
library_name: transformers
license: other
license_link: https://huggingface.co/Qwen/Qwen2.5-3B/blob/main/LICENSE
license_name: qwen-research
quantized_by: mradermacher
---
## About

<!-- ### quantize_version: 2 -->
<!-- ### output_tensor_quantised: 1 -->
<!-- ### convert_type: hf -->
<!-- ### vocab_type:  -->
<!-- ### tags: nicoboss -->
weighted/imatrix quants of https://huggingface.co/Qwen/Qwen2.5-3B

<!-- provided-files -->
static quants are available at https://huggingface.co/mradermacher/Qwen2.5-3B-GGUF

## Usage

If you are unsure how to use GGUF files, refer to one of [TheBloke's
READMEs](https://huggingface.co/TheBloke/KafkaLM-70B-German-V0.1-GGUF) for
more details, including how to concatenate multi-part files.
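
As a rough illustration of what concatenating multi-part files means, here is a
minimal Python sketch that joins parts byte-for-byte in the order given. The
function name `concat_parts` and the part file names in the usage note are
hypothetical. The files listed in the table below are single files, so this
only matters for repositories that ship split files.

```python
import shutil
from pathlib import Path


def concat_parts(parts, output):
    """Join split files byte-for-byte, in the order given, into `output`.

    `parts` must already be in the correct order; this sketch does not
    try to guess the order from the file names.
    """
    output = Path(output)
    with output.open("wb") as dst:
        for part in parts:
            with Path(part).open("rb") as src:
                # Stream in chunks so large parts are never fully loaded into memory.
                shutil.copyfileobj(src, dst)
    return output
```

For example, `concat_parts(["model.gguf.part1of2", "model.gguf.part2of2"], "model.gguf")`
would write the two (hypothetically named) parts into one file.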

## Provided Quants

(sorted by size, not necessarily by quality; IQ-quants are often preferable to similarly sized non-IQ quants)

| Link | Type | Size/GB | Notes |
|:-----|:-----|--------:|:------|
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-IQ1_S.gguf) | i1-IQ1_S | 0.9 | for the desperate |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-IQ1_M.gguf) | i1-IQ1_M | 1.0 | mostly desperate |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-IQ2_XXS.gguf) | i1-IQ2_XXS | 1.0 |  |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-IQ2_XS.gguf) | i1-IQ2_XS | 1.1 |  |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-IQ2_S.gguf) | i1-IQ2_S | 1.2 |  |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-IQ2_M.gguf) | i1-IQ2_M | 1.2 |  |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-Q2_K_S.gguf) | i1-Q2_K_S | 1.3 | very low quality |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-Q2_K.gguf) | i1-Q2_K | 1.4 | IQ3_XXS probably better |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-IQ3_XXS.gguf) | i1-IQ3_XXS | 1.4 | lower quality |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-IQ3_XS.gguf) | i1-IQ3_XS | 1.5 |  |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-Q3_K_S.gguf) | i1-Q3_K_S | 1.6 | IQ3_XS probably better |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-IQ3_S.gguf) | i1-IQ3_S | 1.6 | beats Q3_K* |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-IQ3_M.gguf) | i1-IQ3_M | 1.6 |  |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-Q3_K_M.gguf) | i1-Q3_K_M | 1.7 | IQ3_S probably better |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-Q3_K_L.gguf) | i1-Q3_K_L | 1.8 | IQ3_M probably better |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-IQ4_XS.gguf) | i1-IQ4_XS | 1.8 |  |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-Q4_0_4_4.gguf) | i1-Q4_0_4_4 | 1.9 | fast on arm, low quality |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-Q4_0_4_8.gguf) | i1-Q4_0_4_8 | 1.9 | fast on arm+i8mm, low quality |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-Q4_0_8_8.gguf) | i1-Q4_0_8_8 | 1.9 | fast on arm+sve, low quality |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-IQ4_NL.gguf) | i1-IQ4_NL | 1.9 | prefer IQ4_XS |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-Q4_0.gguf) | i1-Q4_0 | 1.9 | fast, low quality |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-Q4_K_S.gguf) | i1-Q4_K_S | 1.9 | optimal size/speed/quality |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-Q4_K_M.gguf) | i1-Q4_K_M | 2.0 | fast, recommended |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-Q4_1.gguf) | i1-Q4_1 | 2.1 |  |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-Q5_K_S.gguf) | i1-Q5_K_S | 2.3 |  |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-Q5_0.gguf) | i1-Q5_0 | 2.3 |  |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-Q5_K_M.gguf) | i1-Q5_K_M | 2.3 |  |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-Q5_1.gguf) | i1-Q5_1 | 2.4 |  |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-Q6_K.gguf) | i1-Q6_K | 2.6 | practically like static Q6_K |

Here is a handy graph by ikawrakow comparing some lower-quality quant
types (lower is better):

![image.png](https://www.nethype.de/huggingface_embed/quantpplgraph.png)

And here are Artefact2's thoughts on the matter:
https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9
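
The download links in the table above all follow the same URL pattern, so a
small helper can build a link from a quant type and pick the largest quant
under a size budget. This is only a sketch: the helper names `quant_url` and
`largest_fitting` are made up for illustration, only a subset of the table is
copied in, and the sizes are file sizes, not the total memory needed at
runtime (context and buffers come on top).

```python
BASE_URL = "https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main"

# (type, file size in GB), copied from the table above; a subset only.
QUANTS = [
    ("i1-IQ3_XXS", 1.4),
    ("i1-IQ4_XS", 1.8),
    ("i1-Q4_K_S", 1.9),
    ("i1-Q4_K_M", 2.0),
    ("i1-Q5_K_M", 2.3),
    ("i1-Q6_K", 2.6),
]


def quant_url(quant_type):
    """Build the download URL for a quant type such as 'i1-Q4_K_M'."""
    return f"{BASE_URL}/Qwen2.5-3B.{quant_type}.gguf"


def largest_fitting(budget_gb, quants=QUANTS):
    """Return the type of the largest listed quant whose file size fits the budget.

    Returns None when nothing fits. Size is used as a stand-in for quality,
    which the table notes is not always accurate.
    """
    fitting = [q for q in quants if q[1] <= budget_gb]
    if not fitting:
        return None
    return max(fitting, key=lambda q: q[1])[0]
```

For instance, `quant_url(largest_fitting(2.0))` yields the link for
`i1-Q4_K_M`, the entry marked "fast, recommended" in the table.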

## FAQ / Model Request

See https://huggingface.co/mradermacher/model_requests for answers to
questions you might have, or if you want some other model quantized.

## Thanks

I thank my company, [nethype GmbH](https://www.nethype.de/), for letting
me use its servers and providing upgrades to my workstation to enable
this work in my free time. Additional thanks to [@nicoboss](https://huggingface.co/nicoboss) for giving me access to his private supercomputer, enabling me to provide many more imatrix quants, at much higher quality, than I would otherwise be able to.

<!-- end -->