Introducing AutoRound INT4 algorithm
Hello,
First and foremost, I would like to express my gratitude for your exceptional work and for sharing your model with the community. We have recently applied AutoRound to your model and achieved better results for the INT4 model. The accuracies are below, all tested with real quantized models in the same environment on zero-shot tasks.
| Metric | BF16 | 01-ai/Yi-6B-Chat-4bits | INT4 (AutoRound) |
|---|---|---|---|
| Avg. | 0.6043 | 0.5867 | 0.5939 |
| mmlu | 0.6163 | 0.6133 | 0.6119 |
| cmmlu | 0.7431 | 0.7312 | 0.7314 |
| ceval | 0.7355 | 0.7155 | 0.7281 |
| gsm8k | 0.3222 | 0.2866 | 0.3040 |
Unfortunately, we are unable to upload the quantized model due to licensing constraints. We would therefore appreciate it if you could generate it yourself by following the recipe links, and we are here to provide assistance. Additionally, we would be grateful if you would consider using our method to generate quantized models for your new models in the future.
Hi,
As our company has a very strict legal review process that typically takes a long time, we cannot upload the model, at least for now. We would therefore appreciate it if you could try to generate it yourself.
Besides, we now support local data and combinations of different datasets for calibration, e.g. `--dataset "./tmp.json,NeelNanda/pile-10k"`. Using your own training data for calibration may lead to better accuracy. A gentle reminder: for now we drop samples shorter than `args.seqlen`.
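To make the seqlen reminder concrete: samples whose tokenized length is below the threshold are discarded rather than padded, so very short entries in a local calibration file contribute nothing. A minimal sketch of that filtering (this is an illustration, not AutoRound's actual code; the toy whitespace tokenizer stands in for a real one):

```python
# Illustrative sketch of the sample-dropping rule described above:
# calibration samples with fewer than `seqlen` tokens are discarded.
# (Not AutoRound's actual implementation.)
def filter_calibration_samples(samples, tokenize, seqlen):
    """Keep only samples with at least `seqlen` tokens."""
    return [s for s in samples if len(tokenize(s)) >= seqlen]

# Toy tokenizer: whitespace split stands in for a real tokenizer.
samples = ["a b c d", "a b", "one two three four five"]
kept = filter_calibration_samples(samples, str.split, seqlen=3)
print(kept)  # the 2-token sample "a b" is dropped
```

If your local `./tmp.json` contains many short entries, checking their token counts against `args.seqlen` before calibration avoids silently shrinking the calibration set.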
Got it. I will use your method to generate a quantized model and upload it!
Thank you for your kind understanding.