license: apache-2.0 | |
license_link: https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct-GGUF/blob/main/LICENSE | |
language: | |
- en | |
base_model: | |
- Qwen/Qwen2.5-Coder-32B-Instruct | |
pipeline_tag: text-generation | |
library_name: gguf | |
tags: | |
- code | |
- codeqwen | |
- chat | |
- qwen | |
- qwen-coder | |
# Speculative decoding with Qwen 32B + Qwen 1.5B | |
Example: | |
```sh | |
llama-server \ | |
-m qwen2.5-coder-32b-instruct-q4_k_m.gguf \ | |
-md qwen2.5-coder-1.5b-instruct-q4_k_m.gguf \ | |
--ctx-size 65536 | |
``` | |