
Model idea

#1
by Enderchef - opened

It'd be cool if you fine-tuned all these datasets onto Qwen/Qwen3-4B-Thinking-2507.

Owner

Hello @Enderchef ,

I started the fine-tuning process; it shouldn't take that long. I'll reply here once it's published.

Owner

Hey @Enderchef ,

The model is ready; please let me know if it matches your expectations.
Download it here.

Would you share the source code showing how to fine-tune it?

Owner

Hello @Hamora ,

I uploaded the safetensors files of this model here; the model card includes a link to an Unsloth Jupyter notebook you can use to fine-tune it.
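For a rough idea of what that notebook does, a typical Unsloth SFT run looks like the sketch below. This is a minimal illustration following the classic Unsloth + TRL notebook pattern, not the exact recipe for this model: the dataset file, LoRA rank, and hyperparameters are placeholders, and newer TRL versions move the text-field and sequence-length arguments into `SFTConfig`.

```python
# Minimal Unsloth SFT sketch (illustrative placeholders, not the exact notebook).
from unsloth import FastLanguageModel
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer

# Load the base model in 4-bit so it fits on consumer VRAM.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Qwen/Qwen3-4B-Thinking-2507",
    max_seq_length=4096,
    load_in_4bit=True,
)

# Attach LoRA adapters so only a small fraction of the weights are trained.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Placeholder dataset: one pre-formatted chat transcript per "text" field.
dataset = load_dataset("json", data_files="train.jsonl", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=4096,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        num_train_epochs=1,
        learning_rate=2e-4,
        output_dir="outputs",
    ),
)
trainer.train()
```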

So this isn't RLHF tuning, just SFT? Or is it that, because you use the reasoning model as the base, you only need to do the fine-tuning with SFT?

@Liontix How do you think a 30B-A3B MoE fine-tune would work?

Hello @PSM24,

It's technically possible, but it doesn't work on my current setup because it lacks the required VRAM. I am, however, experimenting with other, smaller MoE models, as they may perform significantly better than dense models of a similar size on consumer hardware.
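For a rough sense of why the VRAM runs out: even though only about 3B parameters are active per token in a 30B-A3B MoE, all 30B weights still have to sit in memory. Here's a back-of-envelope sketch covering weights only; the helper below is a hypothetical illustration, and real training also needs room for gradients, optimizer state, and the KV cache.

```python
# Back-of-envelope VRAM needed just to hold the model weights.
def weight_vram_gib(params_billion: float, bytes_per_param: float) -> float:
    return params_billion * 1e9 * bytes_per_param / 1024**3

for name, params in [("Qwen3-4B", 4), ("Qwen3-30B-A3B", 30)]:
    for precision, nbytes in [("fp16", 2.0), ("4-bit", 0.5)]:
        print(f"{name} @ {precision}: ~{weight_vram_gib(params, nbytes):.0f} GiB")

# Qwen3-4B:       ~7 GiB fp16,  ~2 GiB 4-bit
# Qwen3-30B-A3B: ~56 GiB fp16, ~14 GiB 4-bit
```

Even at 4-bit, the 30B-A3B weights alone leave little headroom for fine-tuning on a typical 16 GB consumer GPU.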
