Future use of Gemma knowledge distillation

#4
by gt332a - opened

Great model! Maybe in the next version you could use knowledge distillation, as Google did to make the 9B Gemma much more powerful despite its size. I think this could benefit even small models like this one.
