Dataset used for quantisation
#31, opened by CarlosAndrea
Hello,
What dataset did you use for quantisation, and how many samples?
In my experience, the number of samples is key to achieving good inference quality.
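To make "number of samples" concrete, here is a minimal sketch, assuming calibration samples are random fixed-length windows drawn from an already tokenised corpus. The function name `sample_calibration_windows` and its parameters are hypothetical and not taken from any particular quantisation library; the corpus below is a stand-in list of integers, not real data.

```python
import random


def sample_calibration_windows(token_ids, n_samples, seq_len, seed=0):
    """Draw n_samples random contiguous windows of seq_len tokens.

    Hypothetical helper for illustration only: real quantisation
    toolkits may select calibration data differently.
    """
    if len(token_ids) < seq_len:
        raise ValueError("corpus is shorter than one window")
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    max_start = len(token_ids) - seq_len
    windows = []
    for _ in range(n_samples):
        start = rng.randint(0, max_start)  # inclusive on both ends
        windows.append(token_ids[start:start + seq_len])
    return windows


# Stand-in "corpus": consecutive integers instead of real token ids.
corpus = list(range(10_000))
calib = sample_calibration_windows(corpus, n_samples=128, seq_len=64)
```

Varying `n_samples` (and the corpus the windows come from) while keeping everything else fixed would be one way to check how much the sample count actually matters.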
I notice that the paper uses the C4 dataset (which is somewhat broken on HuggingFace). It may be a bit more diverse than wikitext, but I'm not sure.
I'm also trying to understand to what extent the choice of quantization dataset affects perplexity on data outside that dataset. I've reached out to the paper's authors.
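For anyone wanting to run the same comparison, a minimal sketch of the metric, assuming you already have per-token negative log-likelihoods (natural log) from the model on each evaluation set. The helper names `perplexity` and `relative_degradation` are hypothetical, and obtaining the log-likelihoods from a real model is left out.

```python
import math


def perplexity(token_nlls):
    """Perplexity = exp(mean per-token negative log-likelihood)."""
    if not token_nlls:
        raise ValueError("need at least one token")
    return math.exp(sum(token_nlls) / len(token_nlls))


def relative_degradation(ppl_baseline, ppl_quantized):
    """Fractional perplexity increase of the quantized model over the baseline."""
    return (ppl_quantized - ppl_baseline) / ppl_baseline


# Sanity check: a model that assigns probability 1/4 to every token
# has a per-token NLL of ln(4), so its perplexity is 4.
uniform_nlls = [math.log(4)] * 10
ppl = perplexity(uniform_nlls)
```

Computing `relative_degradation` once on the calibration dataset's held-out split and once on an unrelated dataset would show whether the quantized model degrades more outside the data it was calibrated on.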