patf82/Nous-Hermes-2-Yi-34B-IQ3-imatrix-GGUF


IQ3 quants of NousResearch/Nous-Hermes-2-Yi-34B

Created with llama.cpp 9e359a4f, using the default settings of both convert.py and quantize, together with the importance matrix (imatrix) provided by ikawrakow.

See https://github.com/ggerganov/llama.cpp/pull/5676 for information on the IQ3 quantization.
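The two-step process described above (convert.py, then quantize with an imatrix) might look roughly like the sketch below. It only builds the command lines and does not run them. The file names, the `IQ3_S` type, and the helper name `build_quantization_commands` are illustrative assumptions; this card does not state which IQ3 variants were produced or what the files were called.

```python
import sys
from pathlib import Path


def build_quantization_commands(model_dir, imatrix_file, quant_type="IQ3_S", work_dir="."):
    """Return (convert_cmd, quantize_cmd) as argument lists.

    Hypothetical helper: paths and quant_type are assumptions, not taken
    from the model card.
    """
    work = Path(work_dir)
    # Hypothetical file names for the intermediate and final GGUF files.
    unquantized_gguf = work / "model-unquantized.gguf"
    quantized_gguf = work / f"model-{quant_type}.gguf"

    # Step 1: convert the original checkpoint to GGUF, leaving every
    # convert.py option other than the output path at its default.
    convert_cmd = [
        sys.executable, "convert.py", str(model_dir),
        "--outfile", str(unquantized_gguf),
    ]

    # Step 2: quantize the converted file, passing the importance matrix.
    quantize_cmd = [
        "./quantize", "--imatrix", str(imatrix_file),
        str(unquantized_gguf), str(quantized_gguf), quant_type,
    ]
    return convert_cmd, quantize_cmd


if __name__ == "__main__":
    # Run from a llama.cpp checkout; both paths below are placeholders.
    for cmd in build_quantization_commands("Nous-Hermes-2-Yi-34B", "imatrix.dat"):
        print(" ".join(cmd))
```

To actually execute the steps, each list could be passed to `subprocess.run(cmd, check=True)` from inside a llama.cpp checkout built at the commit named above.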


Format: GGUF

Model size: 34.4B params

Architecture: llama

Quantization: 3-bit
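As a rough back-of-the-envelope check, the parameter count and nominal bit width above give a lower bound on the file size. The sketch below assumes exactly 3 bits per weight; actual IQ3 files are expected to be somewhat larger, since the effective bits per weight is typically above the nominal figure and some tensors may be stored at higher precision. The function name is hypothetical.

```python
def estimate_size_gb(n_params, bits_per_weight):
    """Approximate storage in gigabytes (1 GB = 1e9 bytes) for the weights alone."""
    return n_params * bits_per_weight / 8 / 1e9


# 34.4B parameters at a nominal 3 bits per weight (an assumption, see above).
print(f"{estimate_size_gb(34.4e9, 3.0):.1f} GB")  # prints "12.9 GB"
```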


Base model: 01-ai/Yi-34B (this model is a quantized derivative)
