### Full Perplexity Comparison Table for Release Documentation

| Quantization Type | PPL(Q)  | ln(PPL(Q)/PPL(fp16)) | File Size (G) |
|-------------------|---------|----------------------|---------------|
| IQ2_S             | 25.3893 | 0.501266             | 1.6           |
| IQ2_M             | 21.6684 | 0.342794             | 1.6           |
| Q3_K_M            | 16.8567 | 0.091687             | 1.8           |
| IQ3_M             | 16.774  | 0.086769             | 1.7           |
| Q3_K_L            | 16.5067 | 0.070705             | 1.8           |
| IQ4_NL            | 15.9602 | 0.037037             | 1.9           |
| IQ4_XS            | 15.9591 | 0.036968             | 1.8           |
| Q4_K_S            | 15.9346 | 0.035431             | 1.9           |
| Q4_K_M            | 15.8651 | 0.031060             | 2.0           |
| Q5_K_S            | 15.4901 | 0.007140             | 2.1           |
| Q5_K_M            | 15.4746 | 0.006139             | 2.2           |
| Q6_K              | 15.3961 | 0.001053             | 2.4           |
| Q8_0              | 15.3831 | 0.000208             | 2.7           |
| bf16              | 15.3799 | 0.000000             | 4.2           |

---

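This table lists every quantization type tested, with its **perplexity (PPL)**, **ln(PPL(Q)/PPL(fp16))**, and **file size**. Rows are ordered from highest to lowest perplexity; lower is better. The log-ratio is zero for the bf16 row, which serves as the unquantized reference here, so values closer to zero indicate less quality loss from quantization.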
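
The log-ratio column can be recomputed from the PPL column, which is a handy sanity check before publishing. The following is a minimal sketch, assuming the bf16 row is the reference (its ratio is zero in the table); the dictionary `ppl` and the helper `log_ppl_ratio` are illustrative names, and only a subset of rows is copied in.

```python
import math

# PPL values copied from a few rows of the table (illustrative subset).
ppl = {
    "IQ2_S": 25.3893,
    "Q4_K_M": 15.8651,
    "Q8_0": 15.3831,
    "bf16": 15.3799,  # assumed reference row (log-ratio 0 in the table)
}


def log_ppl_ratio(ppl_q: float, ppl_ref: float) -> float:
    """Return ln(PPL(Q) / PPL(reference))."""
    return math.log(ppl_q / ppl_ref)


# Recompute the log-ratio for each row against the bf16 reference.
ratios = {name: log_ppl_ratio(value, ppl["bf16"]) for name, value in ppl.items()}

for name, ratio in ratios.items():
    print(f"{name:8s} {ratio:.6f}")
```

Because the PPL values in the table are rounded to four decimals, the recomputed ratios may differ from the published column in the last digit or two.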