Model idea #1, opened by Enderchef
It'd be cool if you fine-tuned all these datasets onto Qwen/Qwen3-4B-Thinking-2507.
Hello @Enderchef ,
I started the fine-tuning process; it should not take long. I will reply here once it's published.
Hey @Enderchef ,
The model is ready; please let me know if it matches your expectations.
Download it here.
Would you share the source code for how you fine-tuned it?
So this isn't RLHF tuning, just SFT? Or is it that, because you use the reasoning model as the base, you only need to do the fine-tuning with SFT?
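The thread doesn't include the actual training script, so here is a minimal sketch of what plain SFT on Qwen/Qwen3-4B-Thinking-2507 could look like with TRL's SFTTrainer. The dataset name, output paths, and hyperparameters below are placeholders, not the setup used for the published model:

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Placeholder dataset: any chat-format dataset with a "messages" column works.
dataset = load_dataset("HuggingFaceH4/ultrachat_200k", split="train_sft")

training_args = SFTConfig(
    output_dir="qwen3-4b-thinking-sft",   # where checkpoints are written
    per_device_train_batch_size=1,        # small batch; a 4B model is memory-hungry
    gradient_accumulation_steps=8,        # effective batch size of 8
    num_train_epochs=1,
    learning_rate=2e-5,
    bf16=True,                            # assumes an Ampere or newer GPU
    logging_steps=10,
)

trainer = SFTTrainer(
    model="Qwen/Qwen3-4B-Thinking-2507",  # base model requested in this thread
    args=training_args,
    train_dataset=dataset,
)

trainer.train()
trainer.save_model("qwen3-4b-thinking-sft")
```

SFTTrainer applies the model's chat template to conversational datasets automatically, so no manual prompt formatting is needed; for a 4B model, adding a LoRA/PEFT config would reduce the memory footprint considerably.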