---
license: gemma
base_model:
- ifable/gemma-2-Ifable-9B
pipeline_tag: text-generation
---
## 🛑 Note: not every quant is displayed in the table on the right; you can find everything [here](https://huggingface.co/Hampetiudo/gemma-2-Ifable-9B-i1-GGUF/tree/main).

Using <a href="https://github.com/ggerganov/llama.cpp/">llama.cpp</a> release <a href="https://github.com/ggerganov/llama.cpp/releases/tag/b3804">b3804</a> for quantization.

Original model: https://huggingface.co/ifable/gemma-2-Ifable-9B

All quants were made using the imatrix option (except BF16, which is the original precision). The imatrix was generated with the dataset from [here](https://gist.github.com/tristandruyen/9e207a95c7d75ddf37525d353e00659c), using the BF16 GGUF with a context size of 8192 tokens (the default is 512, but a context size equal to or larger than the model's should improve quality) and 13 chunks.
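
As a rough sketch of that step (file names and paths below are placeholders, not the exact ones used for this repo), the imatrix can be generated with llama.cpp's `llama-imatrix` tool:

```bash
# Generate an importance matrix from the BF16 GGUF (sketch; adjust paths).
# -c 8192 sets the context size (default 512); --chunks 13 limits processing
# to 13 chunks of the calibration data, matching the description above.
./llama-imatrix \
  -m gemma-2-Ifable-9B-BF16.gguf \
  -f calibration.txt \
  -o imatrix.dat \
  -c 8192 \
  --chunks 13
```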

How to make your own quants:

https://github.com/ggerganov/llama.cpp/tree/master/examples/imatrix

https://github.com/ggerganov/llama.cpp/tree/master/examples/quantize
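
Putting those steps together, a minimal end-to-end sketch might look like the following (model paths, file names, and the Q4_K_M target are placeholders, not a prescription):

```bash
# 1. Convert the original Hugging Face model to a BF16 GGUF.
python convert_hf_to_gguf.py ./gemma-2-Ifable-9B \
  --outtype bf16 \
  --outfile gemma-2-Ifable-9B-BF16.gguf

# 2. Generate the importance matrix (see the llama-imatrix example above).

# 3. Quantize with the imatrix; Q4_K_M is just one example target type.
./llama-quantize \
  --imatrix imatrix.dat \
  gemma-2-Ifable-9B-BF16.gguf \
  gemma-2-Ifable-9B-Q4_K_M.gguf \
  Q4_K_M
```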