Performance Drop due to quantization?

#34

by Teja-Gollapudi - opened Aug 14, 2023

Aug 14, 2023

Hi,
Are there any benchmark comparisons for the quantized model vs. the full model?
I want to gauge the performance drop introduced by quantization.

Thank you!

Teja-Gollapudi changed discussion title from Benchmark comparison to Performance Drop due to quantization? Aug 14, 2023

Sep 11, 2023


Did you manage to find any comparison?

Sep 11, 2023

Never got around to doing it 😕.

Sep 12, 2023

As a rough rule of thumb, a 4-bit quantized model is about 95 percent as accurate as the full-precision model.
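
A figure like "95 percent as accurate" is a ratio of two benchmark scores, so it can be checked once both models have been run on the same benchmark. Here is a minimal sketch of that arithmetic. The function names and the example scores are hypothetical placeholders, not measured results for any model.

```python
def retained_fraction(full_score: float, quantized_score: float) -> float:
    """Fraction of the full-precision score that the quantized model keeps."""
    if full_score <= 0:
        raise ValueError("full_score must be positive")
    return quantized_score / full_score


def relative_drop(full_score: float, quantized_score: float) -> float:
    """Relative performance drop introduced by quantization."""
    return 1.0 - retained_fraction(full_score, quantized_score)


if __name__ == "__main__":
    # Placeholder accuracies for illustration only; replace them with scores
    # measured on the same benchmark for both model variants.
    full, quant = 0.80, 0.76
    print(f"retained: {retained_fraction(full, quant):.1%}")
    print(f"relative drop: {relative_drop(full, quant):.1%}")
```

This only works for metrics where higher is better (accuracy, exact match). For perplexity, where lower is better, the ratio would need to be inverted.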

Sep 21, 2023

I found a couple of subreddits discussing that topic:
