Hope you can try a Qwen1.5-14B MoE

#29
by JiangMaster - opened

First of all, thank you. I hope you can try a "Qwen1.5-14B MoE" if you want, either 4x14B or 8x14B.

Hi @JiangMaster

Sure, I can try that. This time I might use Qwen1.5-14B-Chat, which already comes fine-tuned, as the expert for the 4x14B or 8x14B. (If I used the base model, I would need to fine-tune it myself, and I am not sure I can fine-tune a 4x14B or 8x14B.)

Hi, which framework did you use to create this Qwen MoE with a different size?

Hi,
They are not different sizes; all the models are 7B. Is that what you mean?

I mean that, compared to the official Qwen1.5-MoE model, which uses 1.8Bx7, you used 7B models. What was the process? Did you merge the 7B models and then run SFT using a framework?

I understand now, thanks. Here are the steps I took to prepare this model (https://huggingface.co/MaziyarPanahi/Qwen1.5-8x7b-v0.1#model-description); a rough sketch of the fine-tuning step follows the list:

  • Fine-tuned Qwen1.5-7B with the Crystalcareai/MoD-150k dataset
  • Created a raw MoE from 8x of the fine-tuned Qwen1.5-7B
  • Finally, fine-tuned the MoE model again on the Crystalcareai/MoD-150k dataset
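To make the recipe above a bit more concrete, here is a minimal sketch of the supervised fine-tuning steps (the first and last bullets), assuming trl's `SFTTrainer` is used; the hyperparameters, the `"text"` column name, and the output path are illustrative assumptions, not the exact setup behind Qwen1.5-8x7b-v0.1, and argument names can differ across trl versions. The middle step (building the raw MoE from the fine-tuned experts) is typically done with a separate merging tool and is not shown here.

```python
# Minimal sketch of the SFT step, assuming trl's SFTTrainer.
# Hyperparameters, the "text" column name, and the output path are
# illustrative assumptions, not the exact settings used for this model.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Step 1 fine-tunes the dense Qwen1.5-7B; step 3 would point this at the
# raw MoE checkpoint produced by the merge instead.
model_id = "Qwen/Qwen1.5-7B"

# The dataset named in the steps above.
dataset = load_dataset("Crystalcareai/MoD-150k", split="train")

trainer = SFTTrainer(
    model=model_id,  # SFTTrainer loads the model and tokenizer from this id
    train_dataset=dataset,
    args=SFTConfig(
        output_dir="./qwen1.5-7b-mod-150k",  # hypothetical output path
        dataset_text_field="text",           # assumed column name in MoD-150k
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
        bf16=True,
    ),
)
trainer.train()
trainer.save_model("./qwen1.5-7b-mod-150k")
```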

Would love to do this on 14B if I can find resources.
