The Qwen authors note in their blog post that Qwen2-7B can handle sequences of up to 128K tokens, but the GGUF metadata ships with a 32K context length. This version raises the maximum context length to 131,072 tokens, using the `gguf-set-metadata.py` script from llama.cpp with the following command:

```
python gguf-set-metadata.py qwen2-7b-instruct-q5_k_m.gguf qwen2.context_length 131072 --force
```
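To illustrate what such a metadata edit amounts to, here is a minimal sketch, assuming the GGUF v3 layout (magic, version, tensor count, KV count, then key/value pairs): because a `uint32` value like `qwen2.context_length` has a fixed size, it can be overwritten in place without rewriting the rest of the file. The toy writer below exists only for demonstration and is not the llama.cpp implementation.

```python
import struct

GGUF_MAGIC = b"GGUF"
T_UINT32 = 4  # GGUF value-type id for uint32

def _gguf_str(b: bytes) -> bytes:
    """Encode a GGUF string: uint64 length followed by raw UTF-8 bytes."""
    return struct.pack("<Q", len(b)) + b

def build_minimal_gguf(kvs: dict) -> bytes:
    """Build a tensor-less GGUF v3 blob holding only uint32 metadata KVs
    (a toy stand-in for a real model file, for demonstration only)."""
    # Header: magic, version=3, tensor_count=0, metadata_kv_count
    out = GGUF_MAGIC + struct.pack("<IQQ", 3, 0, len(kvs))
    for key, val in kvs.items():
        out += _gguf_str(key.encode()) + struct.pack("<I", T_UINT32) + struct.pack("<I", val)
    return out

def set_uint32_key(blob: bytes, key: str, new_val: int) -> bytes:
    """Overwrite the value of an existing uint32 KV in place.
    Same byte size before and after, so no offsets shift."""
    (n_kv,) = struct.unpack_from("<Q", blob, 16)  # KV count sits after magic/version/tensor_count
    off = 24  # first KV starts right after the 24-byte header
    for _ in range(n_kv):
        (klen,) = struct.unpack_from("<Q", blob, off)
        off += 8
        k = blob[off:off + klen].decode()
        off += klen
        (vtype,) = struct.unpack_from("<I", blob, off)
        off += 4
        if vtype != T_UINT32:
            raise ValueError("this sketch only handles uint32 values")
        if k == key:
            return blob[:off] + struct.pack("<I", new_val) + blob[off + 4:]
        off += 4  # skip the uint32 value of a non-matching key
    raise KeyError(key)
```

For example, patching a blob built with `{"qwen2.context_length": 32768}` to `131072` changes exactly four bytes, which is why the real script can update the metadata of a multi-gigabyte GGUF file instantly.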