"Mega Model"?

by Enderchef - opened 14 days ago

Discussion

Enderchef

14 days ago

A cool idea would be to take all the datasets you have and fine-tune Qwen3 0.6B on all of them at once to make a "Mega Model".

Liontix

TeichAI org 12 days ago

Hello @Enderchef,

Training an LLM on that many different datasets won't result in a good model, because the datasets all share the same question pool. Combining them would only be useful if the different datasets covered specialized topics.
As an alternative, would you like a distill of Qwen3 0.6B based on one of my datasets?
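
To make the "same question pool" point concrete, here is a minimal sketch of how one could measure prompt overlap between two datasets before deciding whether merging them adds anything. It assumes each dataset can be reduced to a plain list of prompt strings; the function names and sample prompts are hypothetical and are not taken from the actual datasets.

```python
from typing import Iterable


def normalize(prompt: str) -> str:
    """Lowercase and collapse whitespace so trivially different copies match."""
    return " ".join(prompt.lower().split())


def prompt_overlap(a: Iterable[str], b: Iterable[str]) -> float:
    """Fraction of the smaller prompt pool that also appears in the other pool.

    1.0 means the smaller dataset asks nothing the other one doesn't;
    0.0 means the two pools are disjoint.
    """
    set_a = {normalize(p) for p in a}
    set_b = {normalize(p) for p in b}
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / min(len(set_a), len(set_b))


# Hypothetical prompt pools standing in for two distillation datasets.
dataset_a = ["What is 2+2?", "Explain recursion."]
dataset_b = ["what is  2+2?", "Explain recursion.", "Name a prime number."]

overlap = prompt_overlap(dataset_a, dataset_b)
print(f"overlap: {overlap:.2f}")
```

A value close to 1.0 suggests the datasets differ mainly in their answers rather than their questions, which is the situation described above. Exact matching after normalization is a deliberately crude choice; paraphrased prompts would need fuzzy or embedding-based matching.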
