---
license: gemma
base_model:
  - ifable/gemma-2-Ifable-9B
pipeline_tag: text-generation
---

Llama.cpp imatrix quants of gemma-2-Ifable-9B

Quantized with llama.cpp release b3804.

Original model: https://huggingface.co/ifable/gemma-2-Ifable-9B

Both quants were made using the imatrix option. The imatrix was generated from the BF16 GGUF with the dataset from here, using a context size of 8192 tokens and 13 chunks. The llama.cpp default context size is 512; a larger value, up to the model's own context size, should improve quality.
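For anyone wanting to reproduce a similar setup, the sketch below assembles the two llama.cpp command lines involved and computes how many calibration tokens the settings above imply. It is an illustration under assumptions, not the exact commands used for these quants: all file names are hypothetical placeholders, the quant type `IQ4_XS` is only an example (this card does not list the types), and the flags (`-m`, `-f`, `-o`, `-c`, `--chunks` for `llama-imatrix`, `--imatrix` for `llama-quantize`) should be checked against `--help` for your llama.cpp build.

```python
import shlex


def imatrix_command(model, dataset, out, ctx=8192, chunks=13):
    """Build an argv list for llama-imatrix (flag names assumed, verify with --help)."""
    return [
        "llama-imatrix",
        "-m", model,        # BF16 GGUF used as the reference model
        "-f", dataset,      # plain-text calibration data
        "-o", out,          # where the importance matrix is written
        "-c", str(ctx),     # context size per chunk (llama.cpp default is 512)
        "--chunks", str(chunks),
    ]


def quantize_command(imatrix, src, dst, qtype):
    """Build an argv list for llama-quantize using a precomputed imatrix."""
    return ["llama-quantize", "--imatrix", imatrix, src, dst, qtype]


def calibration_tokens(ctx=8192, chunks=13):
    """Total tokens processed: each chunk is one full context window."""
    return ctx * chunks


if __name__ == "__main__":
    # Hypothetical file names, chosen only for illustration.
    bf16 = "gemma-2-Ifable-9B-BF16.gguf"
    imat = "imatrix.dat"
    print(shlex.join(imatrix_command(bf16, "calibration.txt", imat)))
    # IQ4_XS is an example type, not necessarily one published here.
    print(shlex.join(quantize_command(imat, bf16, "gemma-2-Ifable-9B-IQ4_XS.gguf", "IQ4_XS")))
    print("calibration tokens:", calibration_tokens())
```

With a context size of 8192 and 13 chunks, the imatrix run covers 8192 × 13 = 106,496 tokens of calibration text. Covering a similar amount at the default context of 512 would take 208 chunks.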