Support of 1M context doubt

#2
by clyang33 - opened

Hi Qwen Team,

You were using DCA (Dual Chunk Attention) to enable 1M context on the https://huggingface.co/Qwen/Qwen3-235B-A22B-Thinking-2507 model. Why switch over to YaRN here? I'm just curious why DCA is being left off.
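For context, YaRN-based context extension on Qwen3 models is typically enabled through the rope_scaling entry in config.json. A minimal sketch, assuming the factor and native context length suggested for other Qwen3 releases (the exact values for this model may differ):

```json
{
  "rope_scaling": {
    "rope_type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 262144
  }
}
```

With a factor of 4.0 over a 262144-token native window, this would target roughly a 1M-token context.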

clyang33 changed discussion title from Support of 1M context to Support of 1M context doubt

I think the key here is that sparse_attention_config is not model-agnostic: the optimal values have to be searched for per model. Give it some time; maybe they will add it later.
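For comparison, the earlier DCA-based 1M releases shipped a per-model tuned attention block in config.json, roughly of this shape (field names and values below are illustrative, following the style of the Qwen2.5-1M configs, not taken from this model):

```json
{
  "dual_chunk_attention": {
    "chunk_size": 262144,
    "local_size": 8192,
    "original_max_position_embeddings": 262144
  }
}
```

Tuning values like these per model is exactly the kind of search that could delay adding DCA support here.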
