---
base_model: Qwen/Qwen2.5-3B
language:
- en
library_name: transformers
license: other
license_link: https://huggingface.co/Qwen/Qwen2.5-3B/blob/main/LICENSE
license_name: qwen-research
quantized_by: mradermacher
---
## About
<!-- ### quantize_version: 2 -->
<!-- ### output_tensor_quantised: 1 -->
<!-- ### convert_type: hf -->
<!-- ### vocab_type: -->
<!-- ### tags: nicoboss -->
weighted/imatrix quants of https://huggingface.co/Qwen/Qwen2.5-3B
<!-- provided-files -->
static quants are available at https://huggingface.co/mradermacher/Qwen2.5-3B-GGUF
## Usage
If you are unsure how to use GGUF files, refer to one of [TheBloke's
READMEs](https://huggingface.co/TheBloke/KafkaLM-70B-German-V0.1-GGUF) for
more details, including how to concatenate multi-part files.
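
As a rough illustration of what concatenating multi-part files involves, here is a minimal Python sketch that joins parts byte for byte, in order, into a single file. The function name `concat_parts` and the part file names in the usage note are hypothetical, and the quants listed in this repository appear to be single files, so this would only matter for larger split models.

```python
import shutil
from pathlib import Path


def concat_parts(parts, output):
    """Join split files, in the order given, into one output file.

    `parts` is an ordered list of paths; the caller is responsible for
    passing them in the right order (part 1 first).
    """
    output = Path(output)
    with output.open("wb") as dst:
        for part in parts:
            with Path(part).open("rb") as src:
                # Stream in chunks so large files do not need to fit in RAM.
                shutil.copyfileobj(src, dst)
    return output
```

Usage, assuming hypothetical part names: `concat_parts(["model.gguf.part1of2", "model.gguf.part2of2"], "model.gguf")`. The order of the list matters, since the parts are simply appended one after another.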
## Provided Quants
(sorted by size, not necessarily by quality. IQ-quants are often preferable to similar-sized non-IQ quants)
| Link | Type | Size/GB | Notes |
|:-----|:-----|--------:|:------|
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-IQ1_S.gguf) | i1-IQ1_S | 0.9 | for the desperate |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-IQ1_M.gguf) | i1-IQ1_M | 1.0 | mostly desperate |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-IQ2_XXS.gguf) | i1-IQ2_XXS | 1.0 | |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-IQ2_XS.gguf) | i1-IQ2_XS | 1.1 | |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-IQ2_S.gguf) | i1-IQ2_S | 1.2 | |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-IQ2_M.gguf) | i1-IQ2_M | 1.2 | |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-Q2_K_S.gguf) | i1-Q2_K_S | 1.3 | very low quality |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-Q2_K.gguf) | i1-Q2_K | 1.4 | IQ3_XXS probably better |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-IQ3_XXS.gguf) | i1-IQ3_XXS | 1.4 | lower quality |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-IQ3_XS.gguf) | i1-IQ3_XS | 1.5 | |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-Q3_K_S.gguf) | i1-Q3_K_S | 1.6 | IQ3_XS probably better |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-IQ3_S.gguf) | i1-IQ3_S | 1.6 | beats Q3_K* |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-IQ3_M.gguf) | i1-IQ3_M | 1.6 | |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-Q3_K_M.gguf) | i1-Q3_K_M | 1.7 | IQ3_S probably better |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-Q3_K_L.gguf) | i1-Q3_K_L | 1.8 | IQ3_M probably better |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-IQ4_XS.gguf) | i1-IQ4_XS | 1.8 | |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-Q4_0_4_4.gguf) | i1-Q4_0_4_4 | 1.9 | fast on arm, low quality |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-Q4_0_4_8.gguf) | i1-Q4_0_4_8 | 1.9 | fast on arm+i8mm, low quality |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-Q4_0_8_8.gguf) | i1-Q4_0_8_8 | 1.9 | fast on arm+sve, low quality |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-IQ4_NL.gguf) | i1-IQ4_NL | 1.9 | prefer IQ4_XS |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-Q4_0.gguf) | i1-Q4_0 | 1.9 | fast, low quality |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-Q4_K_S.gguf) | i1-Q4_K_S | 1.9 | optimal size/speed/quality |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-Q4_K_M.gguf) | i1-Q4_K_M | 2.0 | fast, recommended |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-Q4_1.gguf) | i1-Q4_1 | 2.1 | |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-Q5_K_S.gguf) | i1-Q5_K_S | 2.3 | |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-Q5_0.gguf) | i1-Q5_0 | 2.3 | |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-Q5_K_M.gguf) | i1-Q5_K_M | 2.3 | |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-Q5_1.gguf) | i1-Q5_1 | 2.4 | |
| [GGUF](https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main/Qwen2.5-3B.i1-Q6_K.gguf) | i1-Q6_K | 2.6 | practically like static Q6_K |
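
To narrow down the table, the sketch below lists the quants whose file size fits a given budget and builds the matching download URL. The sizes and the URL pattern are copied from the table above; the helper names `quants_within` and `url_for` are hypothetical. File size is only a rough proxy for memory use at runtime, and a larger file is not necessarily higher quality, so treat the result as a shortlist rather than a recommendation.

```python
# (type, size in GB) pairs copied from the table above.
QUANTS = [
    ("i1-IQ1_S", 0.9), ("i1-IQ1_M", 1.0), ("i1-IQ2_XXS", 1.0),
    ("i1-IQ2_XS", 1.1), ("i1-IQ2_S", 1.2), ("i1-IQ2_M", 1.2),
    ("i1-Q2_K_S", 1.3), ("i1-Q2_K", 1.4), ("i1-IQ3_XXS", 1.4),
    ("i1-IQ3_XS", 1.5), ("i1-Q3_K_S", 1.6), ("i1-IQ3_S", 1.6),
    ("i1-IQ3_M", 1.6), ("i1-Q3_K_M", 1.7), ("i1-Q3_K_L", 1.8),
    ("i1-IQ4_XS", 1.8), ("i1-Q4_0_4_4", 1.9), ("i1-Q4_0_4_8", 1.9),
    ("i1-Q4_0_8_8", 1.9), ("i1-IQ4_NL", 1.9), ("i1-Q4_0", 1.9),
    ("i1-Q4_K_S", 1.9), ("i1-Q4_K_M", 2.0), ("i1-Q4_1", 2.1),
    ("i1-Q5_K_S", 2.3), ("i1-Q5_0", 2.3), ("i1-Q5_K_M", 2.3),
    ("i1-Q5_1", 2.4), ("i1-Q6_K", 2.6),
]

# URL prefix shared by every link in the table.
BASE_URL = "https://huggingface.co/mradermacher/Qwen2.5-3B-i1-GGUF/resolve/main"


def quants_within(budget_gb):
    """Return the quant types whose file size is at most `budget_gb`."""
    return [name for name, size_gb in QUANTS if size_gb <= budget_gb]


def url_for(quant_type):
    """Build the download URL for a quant type, following the table's pattern."""
    return f"{BASE_URL}/Qwen2.5-3B.{quant_type}.gguf"
```

For example, `quants_within(2.0)` includes `i1-Q4_K_M`, and `url_for("i1-Q4_K_M")` reproduces the link in the corresponding table row.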
Here is a handy graph by ikawrakow comparing some lower-quality quant
types (lower is better):
![image.png](https://www.nethype.de/huggingface_embed/quantpplgraph.png)
And here are Artefact2's thoughts on the matter:
https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9
## FAQ / Model Request
See https://huggingface.co/mradermacher/model_requests for answers to
questions you might have, or to request that another model be quantized.
## Thanks
I thank my company, [nethype GmbH](https://www.nethype.de/), for letting
me use its servers and providing upgrades to my workstation to enable
this work in my free time. Additional thanks to [@nicoboss](https://huggingface.co/nicoboss) for giving me access to his private supercomputer, enabling me to provide many more imatrix quants, at much higher quality, than I would otherwise be able to.
<!-- end -->