Instruct-v0.1 or Instruct-v0.2 based?

by zappa2005 - opened

I couldn't find this information on the model page, hence the question: is this model based on v0.1 or v0.2?

I read somewhere that v0.1 uses sliding-window attention (SWA) as a context-extension method, which did not work well in some cases, so they changed that in v0.2.

I guess I read it here, but I'm not sure - it was late. Thank you!

NeverSleep org

What? This is Mixtral-8x7B-Instruct-v0.1, not Mistral-7B-Instruct-v0.2. This is a Mixtral model, and there is no v0.2 Instruct for Mixtral.

IkariDev changed discussion status to closed
