---
base_model: google/gemma-2-9b-it
inference: false
license: gemma
model_name: Gemma-2-9B-it-4Bit-GPTQ
pipeline_tag: text-generation
quantized_by: qilowoq
tags:
- gptq
language:
- en
- ru
---
|
|
|
# Gemma-2-9B-it-4Bit-GPTQ
|
|
|
- Original Model: [gemma-2-9b-it](https://huggingface.co/google/gemma-2-9b-it)
- Model Creator: [google](https://huggingface.co/google)
|
|
|
## Quantization
|
|
|
- This model was quantized with the AutoGPTQ library, using a calibration dataset of English and Russian Wikipedia articles. It achieves lower perplexity on Russian data than other GPTQ quantizations of this model.
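Perplexity, as reported in the table below, is the exponential of the mean per-token negative log-likelihood over the evaluation text. A minimal sketch of that reduction (the model scoring step that produces the per-token losses is not shown):

```python
import math

def perplexity(token_nlls: list[float]) -> float:
    """Perplexity = exp(mean negative log-likelihood per token).

    `token_nlls` holds the per-token cross-entropy losses (in nats)
    collected from the model over the evaluation corpus.
    """
    return math.exp(sum(token_nlls) / len(token_nlls))
```

Lower values mean the model assigns higher probability to the reference text; a perplexity of 6.36 vs. 6.50 on the same corpus indicates a small but consistent advantage.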
|
|
|
| Model | Bits | Perplexity (Russian wiki) |
| --- | --- | --- |
| [gemma-2-9b-it](https://huggingface.co/google/gemma-2-9b-it) | 16-bit | 6.2152 |
| [Granther/Gemma-2-9B-Instruct-4Bit-GPTQ](https://huggingface.co/Granther/Gemma-2-9B-Instruct-4Bit-GPTQ) | 4-bit | 6.4966 |
| this model | 4-bit | 6.3593 |
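A minimal usage sketch, assuming the quantized checkpoint is published at `qilowoq/Gemma-2-9B-it-4Bit-GPTQ` (a hypothetical repo id; adjust to the actual Hub path) and that `transformers` with a GPTQ backend is installed:

```python
# Hypothetical repo id -- adjust to the actual Hub path of this quantization.
MODEL_ID = "qilowoq/Gemma-2-9B-it-4Bit-GPTQ"

def format_gemma_prompt(user_message: str) -> str:
    """Build a single-turn prompt in Gemma's chat format."""
    return (
        "<start_of_turn>user\n"
        f"{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Load the GPTQ checkpoint and generate a completion.

    Requires `transformers` plus a GPTQ backend and a GPU; the import is
    deferred so the helpers above stay usable without those dependencies.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    inputs = tokenizer(format_gemma_prompt(prompt), return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
```

Alternatively, `tokenizer.apply_chat_template` can build the prompt instead of the manual `format_gemma_prompt` helper.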
|