salamandra-2b-instruct / quanization_results.md

Full Perplexity Comparison Table for Release Documentation

| Quantization Type | PPL(Q) | ln(PPL(Q)/PPL(fp16)) | File Size (G) |
|---|---|---|---|
| IQ2_S | 25.3893 | 0.501266 | 1.6 |
| IQ2_M | 21.6684 | 0.342794 | 1.6 |
| Q3_K_M | 16.8567 | 0.091687 | 1.8 |
| IQ3_M | 16.774 | 0.086769 | 1.7 |
| Q3_K_L | 16.5067 | 0.070705 | 1.8 |
| IQ4_NL | 15.9602 | 0.037037 | 1.9 |
| IQ4_XS | 15.9591 | 0.036968 | 1.8 |
| Q4_K_S | 15.9346 | 0.035431 | 1.9 |
| Q4_K_M | 15.8651 | 0.031060 | 2.0 |
| Q5_K_S | 15.4901 | 0.007140 | 2.1 |
| Q5_K_M | 15.4746 | 0.006139 | 2.2 |
| Q6_K | 15.3961 | 0.001053 | 2.4 |
| Q8_0 | 15.3831 | 0.000208 | 2.7 |
| bf16 | 15.3799 | 0.000000 | 4.2 |

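This table covers every quantization type tested, listing each one's perplexity (PPL), its log ratio ln(PPL(Q)/PPL(fp16)) against the unquantized baseline, and its file size. The baseline is the bf16 row (PPL 15.3799), so its log ratio is 0 by definition; lower values in both the PPL and log-ratio columns mean the quantized model stays closer to the baseline.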
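
The log-ratio column can be checked directly from the PPL column. The following is a minimal sketch, assuming the bf16 row is the reference that the header labels fp16; the function names and the choice of rows are illustrative and do not come from the original evaluation scripts.

```python
import math

# PPL values copied from the table (a subset of rows, for brevity).
PPL = {
    "IQ2_S": 25.3893,
    "Q4_K_M": 15.8651,
    "Q8_0": 15.3831,
    "bf16": 15.3799,  # assumed baseline: the unquantized row
}


def log_ppl_ratio(ppl_q: float, ppl_base: float) -> float:
    """Return ln(PPL(Q) / PPL(base)), the table's third column."""
    return math.log(ppl_q / ppl_base)


def ppl_increase_percent(ppl_q: float, ppl_base: float) -> float:
    """Return the relative perplexity increase over the baseline, in percent."""
    return (ppl_q / ppl_base - 1.0) * 100.0


if __name__ == "__main__":
    base = PPL["bf16"]
    for name, ppl in PPL.items():
        ratio = log_ppl_ratio(ppl, base)
        pct = ppl_increase_percent(ppl, base)
        print(f"{name:8s} ln-ratio={ratio:.6f}  increase={pct:.2f}%")
```

Because the PPL values in the table are rounded to four decimal places, the recomputed log ratios should agree with the published column only to within rounding.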