Qwen2-7B-Instruct-128k-GGUF / README.md

Vezora

license: apache-2.0

The Qwen authors highlight in their blog post that Qwen2 7B can handle sequences of up to 128k tokens, but the GGUF metadata sets the context length to 32k. I changed it to 131072 (128k) using the gguf-set-metadata.py script from llama.cpp, with this command: "python gguf-set-metadata.py qwen2-7b-instruct-q5_k_m.gguf qwen2.context_length 131072 --force"
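To check that the change took effect, the metadata can be read back from the file. The following is a minimal sketch, not part of llama.cpp: it assumes a little-endian GGUF file of version 2 or later and uses only the Python standard library. The function name `read_gguf_metadata` and its helpers are hypothetical names chosen for this illustration.

```python
import struct

# GGUF scalar value types mapped to struct format codes.
# Assumption: the file is little-endian (the common case).
_SCALARS = {0: "B", 1: "b", 2: "H", 3: "h", 4: "I", 5: "i",
            6: "f", 7: "?", 10: "Q", 11: "q", 12: "d"}
_STRING, _ARRAY = 8, 9


def _unpack(f, code):
    """Read one little-endian scalar of the given struct code."""
    size = struct.calcsize("<" + code)
    return struct.unpack("<" + code, f.read(size))[0]


def _read_string(f):
    """GGUF strings are a uint64 byte length followed by UTF-8 bytes."""
    length = _unpack(f, "Q")
    return f.read(length).decode("utf-8", errors="replace")


def _read_value(f, vtype):
    """Read a metadata value of the given GGUF type id."""
    if vtype in _SCALARS:
        return _unpack(f, _SCALARS[vtype])
    if vtype == _STRING:
        return _read_string(f)
    if vtype == _ARRAY:
        item_type = _unpack(f, "I")
        count = _unpack(f, "Q")
        return [_read_value(f, item_type) for _ in range(count)]
    raise ValueError(f"unknown GGUF value type {vtype}")


def read_gguf_metadata(path):
    """Return the metadata key/value pairs of a GGUF file as a dict.

    Hypothetical helper for illustration; handles GGUF v2+ headers only.
    """
    with open(path, "rb") as f:
        if f.read(4) != b"GGUF":
            raise ValueError("not a GGUF file")
        version = _unpack(f, "I")
        if version < 2:
            raise ValueError("only GGUF v2+ is handled in this sketch")
        _unpack(f, "Q")  # tensor count, not needed here
        kv_count = _unpack(f, "Q")
        metadata = {}
        for _ in range(kv_count):
            key = _read_string(f)
            vtype = _unpack(f, "I")
            metadata[key] = _read_value(f, vtype)
        return metadata
```

With the file from this repository, `read_gguf_metadata("qwen2-7b-instruct-q5_k_m.gguf").get("qwen2.context_length")` should report 131072 after the edit. Because the sketch reads every metadata entry, including the tokenizer vocabulary arrays, it may take a few seconds on a real model file.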