Nexesenex committed · verified · Commit ab6493f · 1 Parent(s): 3be798b

Update README.md

Files changed (1): README.md (+44 -3)
---
license: llama3.1
---

Experimental GGUF quants for https://huggingface.co/google/gemma-2-9b-it, made according to llama.cpp (LCPP) PR https://github.com/ggerganov/llama.cpp/pull/8836 (based on b_3529, and now b_3565 for the newer ones).
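
If you want to try one of these quants locally, here is a minimal sketch using the llama-cpp-python bindings; the model file name below is a placeholder, not an exact name from this repo.

```python
# Minimal sketch, assuming llama-cpp-python is installed
# (pip install llama-cpp-python) and a quant has been downloaded.
from llama_cpp import Llama

llm = Llama(
    model_path="gemma-2-9b-it-IQ4_XS.gguf",  # placeholder file name
    n_ctx=512,  # same context length as the PPL-512 runs below
)

out = llm("Why is the sky blue?", max_tokens=128)
print(out["choices"][0]["text"])
```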

These experimental quant strategies, which revisit Ikawrakow's work, show a slight decrease in perplexity, including per bpw (from 10%+ for the lowest quants down to 0.x% for the highest ones). This is significant enough to encourage you to test them and provide feedback where pertinent.

The iMatrix I use is based on Group Merged V3, enriched with a bit of French, Serbian, and Croatian.
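
For context, an importance matrix like this is produced and applied with the stock llama.cpp tools (`llama-imatrix` and `llama-quantize`); a rough sketch follows, in which all paths and file names are illustrative assumptions.

```python
# Rough sketch of the imatrix workflow via the stock llama.cpp binaries;
# every path and file name here is an illustrative assumption.
import subprocess

# 1) Gather activation statistics over the calibration corpus.
subprocess.run([
    "./llama-imatrix",
    "-m", "gemma-2-9b-it-F16.gguf",            # full-precision source (assumed name)
    "-f", "calibration-groups-merged-v3.txt",  # calibration text (assumed name)
    "-o", "imatrix.dat",
], check=True)

# 2) Quantize with the imatrix so the most important weights keep more precision.
subprocess.run([
    "./llama-quantize",
    "--imatrix", "imatrix.dat",
    "gemma-2-9b-it-F16.gguf",
    "gemma-2-9b-it-IQ4_XS.gguf",
    "IQ4_XS",
], check=True)
```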

ARC and PPL-512 data (see the main post of the PR thread for the latest figures):

```
IQ4_XS

Master
Size : 4.13 GiB (4.42 BPW)
Arc-C 299 : 49.16387960
Arc-E 570 : 72.10526316
PPL 512 wikitext : 7.5226 +/- 0.04820

IQ4_XSR

PR
Size :
Arc-C 299 :
Arc-E 570 :
PPL 512 wikitext :

FP16

MASTER : Gemma 2 9b It F16.
Size : 14.96 GiB (16.00 BPW)
Arc-C 299 : 49.49832776
Arc-E 570 : 73.85964912
PPL 512 wikitext : 7.3224 +/- 0.04674
```
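
As a quick sanity check on the figures above: the IQ4_XS quant pays roughly a 2.7% perplexity premium over FP16, and the size/BPW pairs are mutually consistent (both imply about the same weight count).

```python
# Quick arithmetic on the numbers in the block above.
ppl_iq4_xs, ppl_f16 = 7.5226, 7.3224
print(f"PPL increase vs FP16: {100 * (ppl_iq4_xs / ppl_f16 - 1):.2f}%")  # ~2.73%

GIB = 1024 ** 3
for name, size_gib, bpw in [("IQ4_XS", 4.13, 4.42), ("F16", 14.96, 16.00)]:
    weights = size_gib * GIB * 8 / bpw  # total bits / bits-per-weight
    print(f"{name}: ~{weights / 1e9:.2f}B weights")  # both ~8.0B
```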