
"Mega Model"?

#1
by Enderchef - opened

A cool idea would be to take all the datasets you have and fine-tune Qwen3 0.6B on all of them at once to make a "Mega Model".

TeichAI org

Hello @Enderchef,

Training an LLM on that many different datasets won't result in a good model, because they all share the same question pool: the model would see the same prompts over and over, just with different answers. Combining them would only be useful if the datasets covered different specialized topics.
As an alternative, would you like a distill of Qwen3 0.6B based on one of my datasets?
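To make the "same question pool" point concrete, here is a rough sketch of how one could measure prompt overlap between two datasets before deciding whether to combine them. The function names and the sample questions are made up for illustration, and it assumes each dataset can be reduced to a plain list of question strings; it is not how these datasets were actually built or checked.

```python
def normalize(question: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(question.lower().split())


def question_overlap(questions_a: list[str], questions_b: list[str]) -> float:
    """Jaccard overlap of the normalized question sets (0.0 = disjoint, 1.0 = identical)."""
    set_a = {normalize(q) for q in questions_a}
    set_b = {normalize(q) for q in questions_b}
    if not set_a and not set_b:
        return 0.0  # nothing to compare
    return len(set_a & set_b) / len(set_a | set_b)


# Hypothetical example: two distill datasets built from the same prompts.
dataset_a = ["What is 2+2?", "Name a prime number."]
dataset_b = ["what is  2+2?", "Name a prime number."]
overlap = question_overlap(dataset_a, dataset_b)
```

An overlap close to 1.0 means the second dataset adds almost no new prompts, which is the situation described above. A low overlap would suggest the datasets cover different ground and might be worth mixing.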
