### Full Perplexity Comparison Table for Release Documentation

| Quantization Type | PPL(Q)  | ln(PPL(Q)/PPL(bf16)) | File Size (G) |
|-------------------|---------|---------------------|---------------|
| IQ2_S             | 25.3893 | 0.501266            | 1.6           |
| IQ2_M             | 21.6684 | 0.342794            | 1.6           |
| Q3_K_M            | 16.8567 | 0.091687            | 1.8           |
| IQ3_M             | 16.7740 | 0.086769            | 1.7           |
| Q3_K_L            | 16.5067 | 0.070705            | 1.8           |
| IQ4_NL            | 15.9602 | 0.037037            | 1.9           |
| IQ4_XS            | 15.9591 | 0.036968            | 1.8           |
| Q4_K_S            | 15.9346 | 0.035431            | 1.9           |
| Q4_K_M            | 15.8651 | 0.031060            | 2.0           |
| Q5_K_S            | 15.4901 | 0.007140            | 2.1           |
| Q5_K_M            | 15.4746 | 0.006139            | 2.2           |
| Q6_K              | 15.3961 | 0.001053            | 2.4           |
| Q8_0              | 15.3831 | 0.000208            | 2.7           |
| bf16              | 15.3799 | 0.000000            | 4.2           |


---

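The table above lists every quantization type tested, with its **perplexity (PPL)**, the **natural log of the ratio of that perplexity to the unquantized bf16 baseline** (0 for bf16 itself), and its **file size**. Rows are sorted from highest to lowest perplexity, so rows further down are closer to the baseline.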
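
As a rough illustration of how the log-ratio column relates to the perplexity column, the sketch below recomputes it for a few rows. It assumes the bf16 row (15.3799) is the baseline; the names `BASELINE_PPL`, `ppl`, and `log_ppl_ratio` are hypothetical and are not part of any tool used to produce the table.

```python
import math

# Assumption: the bf16 row of the table is the reference perplexity.
BASELINE_PPL = 15.3799

# A few PPL(Q) values copied from the table above.
ppl = {"IQ2_S": 25.3893, "Q4_K_M": 15.8651, "Q8_0": 15.3831}


def log_ppl_ratio(ppl_q, ppl_base=BASELINE_PPL):
    """Natural log of the quantized perplexity over the baseline perplexity."""
    return math.log(ppl_q / ppl_base)


for name, value in ppl.items():
    ratio = log_ppl_ratio(value)
    # expm1(ratio) converts the log-ratio back into a relative PPL increase.
    print(f"{name:8s} ln-ratio={ratio:.6f}  (+{math.expm1(ratio) * 100:.2f}% PPL)")
```

Because the table rounds perplexity to four decimals, values recomputed this way may differ from the listed ones in the last digit.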