For this quantization, we used 1 codebook of 16 bits.
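
As a rough illustration of what "1 codebook of 16 bits" implies: each group of weights is replaced by one 16-bit index into a codebook with 2^16 entries. The sketch below computes the resulting code cost per weight. The group size of 8 is an assumption (it is not stated in this card), the function name and parameters are hypothetical, and the storage for the codebooks and scales themselves is ignored.

```python
def bits_per_weight(num_codebooks: int, codebook_bits: int, group_size: int) -> float:
    """Bits spent on codebook indices per original weight (codebook storage ignored)."""
    return num_codebooks * codebook_bits / group_size


def codebook_entries(codebook_bits: int) -> int:
    """Number of distinct vectors a codebook with this index width can address."""
    return 2 ** codebook_bits


# 1 codebook, 16-bit indices, ASSUMED group size of 8 weights per index.
print(bits_per_weight(num_codebooks=1, codebook_bits=16, group_size=8))
print(codebook_entries(16))
```

Under that assumption the index cost is 2 bits per weight; a different group size would change the figure proportionally.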

Results (measured with lm_eval==0.4.0):

| Model | Quantization | MMLU (5-shot) | ArcC | ArcE | Hellaswag | Winogrande | PiQA | Model size (GB) |
|---|---|---|---|---|---|---|---|---|
| meta-llama/Meta-Llama-3-70B | - | 0.7980 | 0.6160 | 0.8624 | 0.6367 | 0.8183 | 0.7632 | 141.2 |
| | 1x16 | 0.7587 | 0.4863 | 0.7668 | 0.6159 | 0.7481 | 0.7537 | 21.9 |
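
To read the table as a size/accuracy trade-off, the following minimal sketch computes the compression ratio and the per-benchmark score drop. It uses only the numbers reported above; the variable names are illustrative.

```python
# Scores and sizes copied from the results table above.
baseline = {"MMLU": 0.7980, "ArcC": 0.6160, "ArcE": 0.8624,
            "Hellaswag": 0.6367, "Winogrande": 0.8183, "PiQA": 0.7632}
quantized = {"MMLU": 0.7587, "ArcC": 0.4863, "ArcE": 0.7668,
             "Hellaswag": 0.6159, "Winogrande": 0.7481, "PiQA": 0.7537}
baseline_size_gb, quantized_size_gb = 141.2, 21.9

# How many times smaller the 1x16 checkpoint is than the unquantized one.
compression_ratio = baseline_size_gb / quantized_size_gb

# Absolute drop in score for each benchmark (positive means the quantized model is worse).
drops = {task: baseline[task] - quantized[task] for task in baseline}
largest_drop = max(drops, key=drops.get)

print(f"compression ratio: {compression_ratio:.2f}x")
for task, drop in sorted(drops.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{task}: -{drop:.4f}")
```

On these numbers the quantized checkpoint is roughly 6.4 times smaller. The drop is largest on ArcC and smallest on PiQA.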
