DavidAU committed on
Commit e5ea2b6
1 Parent(s): af3f3a9

Update README.md

Files changed (1)
  1. README.md +57 -3
README.md CHANGED
@@ -1,3 +1,57 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ ---
+ <font color=red><h3> Ultra High Remaster of the incredible: Psyonic-Cetacean-20b. </h3></font>
+
+ This is a Floating Point 32 upscale, where all components and merges were remastered in floating point 32.
+ This includes all the merges (recreated with master files) and, where possible, substituting in full FP32 models.
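+
+ As a rough sketch of the up-casting step (an assumed workflow, not the author's exact recipe; the repo id below is illustrative and stands in for whichever master file is being remastered):
+
+ ```python
+ # Sketch: force a merge component up to full float32 before re-merging,
+ # so no precision is dropped ahead of GGUF conversion.
+ import torch
+ from transformers import AutoModelForCausalLM
+
+ model = AutoModelForCausalLM.from_pretrained(
+     "jebcarter/psyonic-cetacean-20B",  # example component; repeat per master file
+     torch_dtype=torch.float32,         # load / up-cast all weights as FP32
+ )
+ model.save_pretrained("psyonic-cetacean-20b-fp32", safe_serialization=True)
+ ```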
+
+ The goal: carry forward maximum precision right up to the point where the model is "GGUFed".
+
+ This includes an F32 master file for GGUF too... at a whopping 78 GB.
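+
+ For the GGUF side, a typical way to produce such an F32 master and quantize from it would be llama.cpp's conversion and quantize tools (script and flag names vary across llama.cpp versions; this is a sketch of a typical flow, not the author's command log):
+
+ ```python
+ # Sketch: build a full-precision F32 GGUF master, then quantize it down,
+ # so every quant inherits maximum precision. File names are placeholders.
+ import subprocess
+
+ # 1) Convert the FP32 HF checkpoint to an F32 GGUF master (~78 GB).
+ subprocess.run(
+     ["python", "convert.py", "psyonic-cetacean-20b-fp32",
+      "--outtype", "f32", "--outfile", "psyonic-cetacean-20b-f32.gguf"],
+     check=True,
+ )
+
+ # 2) Quantize directly from the F32 master.
+ for qtype in ["Q2_K", "Q4_K_M", "Q6_K"]:
+     subprocess.run(
+         ["./quantize", "psyonic-cetacean-20b-f32.gguf",
+          f"psyonic-cetacean-20b-{qtype}.gguf", qtype],
+         check=True,
+     )
+ ```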
+
+ Why?
+
+ Because the difference between F32 and BF16 is... 8 DECIMAL places.
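+
+ To see the gap concretely, here is a tiny PyTorch check (illustrative only) of how much of a value survives the round trip to BF16:
+
+ ```python
+ import torch
+
+ x = torch.tensor(3.141592653589793, dtype=torch.float32)
+ y = x.to(torch.bfloat16)
+
+ # bfloat16 keeps only 8 mantissa bits vs float32's 24, so the
+ # trailing digits are rounded away.
+ print(f"float32 : {x.item():.10f}")          # 3.1415927410
+ print(f"bfloat16: {y.float().item():.10f}")  # 3.1406250000
+ ```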
+
+ And as each merge / model is modified, there are "losses" along the way.
+
+ These losses are carried forward and in turn lead to more losses.
+
+ Small?
+
+ Yes... but multiplied by each merge and compression: 20 billion times.
+
+ <B>The result:</b>
+
+ At Q2K, an impressive drop of 533 points in perplexity (lower is better; a "point" here is 0.0001 PPL).
+ (VS: Q2K original base model: PPL = 9.8077 +/- 0.06821)
+
+ At Q4KM, an incredible drop of 976 points in perplexity.
+ (VS: Q4KM original base model -> PPL = 8.7858 +/- 0.06074)
+
+ At Q6, an awesome drop of 234 points in perplexity.
+ (VS: Q6 original base model -> PPL = 8.6070 +/- 0.05907)
+
+ To put this in perspective: "Q6" now operates ABOVE the original full-precision version of "Psyonic-Cetacean-20b".
+
+ This is because at "Q6" the quantized / compressed model is considered accurate to within "+0.0008 ppl" of the full, uncompressed / unquantized model, and it exceeds that margin by over 200 points.
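+
+ For reference, perplexity figures like the ones above are typically produced with llama.cpp's `perplexity` tool; a minimal comparison harness (model and test-file names are placeholders, not the actual artifacts) might look like:
+
+ ```python
+ # Sketch: compare PPL of the original vs remastered quants. The tool
+ # prints a final "PPL = <value> +/- <error>" line for each model.
+ import subprocess
+
+ MODELS = {
+     "original Q6_K": "psyonic-cetacean-20b-original-Q6_K.gguf",
+     "remaster Q6_K": "psyonic-cetacean-20b-ultra-Q6_K.gguf",
+ }
+
+ for label, path in MODELS.items():
+     print(f"--- {label} ---")
+     subprocess.run(["./perplexity", "-m", path, "-f", "wiki.test.raw"], check=True)
+ ```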
+
+ <B>The bottom line here is:</b>
+
+ Higher-quality instruction following and output.
+
+ Likewise, you can use a smaller quant, with higher tokens per second, and still get great quality.
+
+ Same great model... turbocharged.
+
+ This is the first group of remasters.
+
+ <B>More Coming soon...</B>
+
+ It will be followed by a "reg quant plus", which adds additional components into the GGUF (all at floating point 32
+ precision) to further increase the sheer creativity and raw AI horsepower.
+
+ Following this group will be full-precision Imatrix and Imatrix Plus repos that will push the limit even more.
+
+ Details of all methods employed to make these high-precision remasters will be posted shortly, along with comparisons of the original model and the new ultra remaster.