Fix chat template including thinking token

#3
by chompk - opened

The original chat template places the {% generation %} tag wrongly. I suppose the original version was copied from other Qwen3 models that require reasoning. This model, however, doesn't produce a thinking tag, so finetuning with this tokenizer under --assistant_only_loss resulted in incorrect assistant masking. This change fixes the handling of the think token for the instruction model while also allowing proper masking for instruction tuning.
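To illustrate the failure mode, here is a hypothetical sketch (not the actual transformers internals, and the helper name `render_with_mask` is made up): the {% generation %} block determines which rendered characters count as assistant tokens for --assistant_only_loss, so a thinking scaffold left inside that block gets masked as if it were part of the reply.

```python
# Toy model of how an assistant-only loss mask is derived from a chat
# template's {% generation %} span. Purely illustrative; the real logic
# lives in transformers' apply_chat_template with
# return_assistant_tokens_mask=True.

def render_with_mask(messages, leak_thinking=False):
    """Render a toy transcript; return (text, mask) where mask[i] == 1
    iff character i would receive loss under assistant-only training."""
    text, mask = "", []

    def emit(s, in_generation):
        nonlocal text
        text += s
        mask.extend([1 if in_generation else 0] * len(s))

    for m in messages:
        emit(f"<|im_start|>{m['role']}\n", in_generation=False)
        if m["role"] == "assistant":
            if leak_thinking:
                # The bug: a <think> scaffold copied from a reasoning
                # model's template ends up inside the generation span,
                # so it is wrongly counted as assistant output.
                emit("<think>\n\n</think>\n\n", in_generation=True)
            # Only the actual reply should be inside {% generation %}.
            emit(m["content"], in_generation=True)
        else:
            emit(m["content"], in_generation=False)
        emit("<|im_end|>\n", in_generation=False)
    return text, mask

msgs = [
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Hello!"},
]

# Correct template: the mask covers exactly the assistant reply.
text, mask = render_with_mask(msgs)
assert "".join(text[i] for i, v in enumerate(mask) if v) == "Hello!"

# Buggy template: the bogus thinking scaffold is masked in as well.
bad_text, bad_mask = render_with_mask(msgs, leak_thinking=True)
masked = "".join(bad_text[i] for i, v in enumerate(bad_mask) if v)
assert masked.startswith("<think>")
```

The fix in this PR is the template-side equivalent of the first case: keep the {% generation %} span tight around the reply and drop the thinking scaffold entirely for this non-reasoning instruct model.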

