Support of 1M context doubt
#2
by
clyang33
- opened
Hi Qwen Team,
You were using DCA to enable 1M context on the https://huggingface.co/Qwen/Qwen3-235B-A22B-Thinking-2507 model, so why the switch over to YaRN? I am just curious why DCA was left out.
clyang33 changed discussion title from "Support of 1M context" to "Support of 1M context doubt"
I think this key, `sparse_attention_config`, is not model-agnostic; they need to search for the optimal values per model. Give it some time, maybe they will add it later.
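For context, Qwen model cards typically document enabling long context via a `rope_scaling` entry in `config.json` rather than a DCA/sparse-attention key. A sketch of what that looks like (the `factor` and `original_max_position_embeddings` values here are illustrative, not the tuned values for this model):

```json
{
  "rope_scaling": {
    "rope_type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 262144
  }
}
```

Unlike this static YaRN scaling, a DCA-style setup needs per-model chunk-size tuning, which may be why it is not shipped by default.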