2024-02-26: Updated quants - IQ3_M/IQ3_S/IQ3_XS and IQ2_M/IQ2_S (these require llama.cpp at commit a33e6a0d or later).

Layers: 80
Context: 32764
Template:
<|im_start|>system
{instructions}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
{response}
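
To illustrate how the template above is assembled at inference time, here is a minimal sketch. It assumes the llama-cpp-python bindings and a hypothetical local filename; neither is part of this card, and llama.cpp's own CLI works just as well.

```python
# Minimal sketch only: llama-cpp-python and the filename below are assumptions,
# not part of this model card.
from llama_cpp import Llama

llm = Llama(
    model_path="model.IQ3_M.gguf",  # hypothetical filename for one of the IQ3_M quants
    n_ctx=32764,                    # context length listed above
    n_gpu_layers=80,                # offload all 80 layers if VRAM allows
)

# Fill in the ChatML-style template exactly as specified above.
prompt = (
    "<|im_start|>system\n"
    "You are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\n"
    "Summarize the GGUF format in one sentence.<|im_end|>\n"
    "<|im_start|>assistant\n"
)

out = llm(prompt, max_tokens=256, stop=["<|im_end|>"])
print(out["choices"][0]["text"])
```

Stopping on `<|im_end|>` keeps the model from running past the assistant turn into a new one.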

These files use llama.cpp's new 2- and 3-bit quantization types.

Format: GGUF
Model size: 69B params
Architecture: llama

Available quantization precisions: 1-bit, 2-bit, 3-bit, 4-bit