The GPTQ format has a bug in its asymmetric (asym) kernel that causes a large accuracy drop for small models and for 2-bit quantization. Using the AutoRound format instead is recommended.