amd/Mistral-7B-Instruct-v0.3-awq-g128-int4-asym-fp16-onnx-hybrid

	---
	license: apache-2.0
	base_model:
	- mistralai/Mistral-7B-Instruct-v0.3
	---


	# mistralai/Mistral-7B-Instruct-v0.3
	- ## Introduction
	- Quantization Tool: Quark 0.6.0
	- OGA Model Builder: v0.5.1
	- Postprocessing: generates the hybrid model
	- ## Quantization Strategy
	- AWQ / Group 128 / Asymmetric / UINT4 Weights / FP16 activations
	- Excluded Layers: None
	```
	python3 quantize_quark.py \
	--model_dir "$model" \
	--output_dir "$output_dir" \
	--quant_scheme w_uint4_per_group_asym \
	--num_calib_data 128 \
	--quant_algo awq \
	--dataset pileval_for_awq_benchmark \
	--seq_len 512 \
	--model_export quark_safetensors \
	--data_type float16 \
	--exclude_layers [] \
	--custom_mode awq
	```
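To make the strategy line concrete, here is a minimal pure-Python sketch of group-wise asymmetric UINT4 weight quantization with a group size of 128. It illustrates only the storage format named above, not Quark's implementation: the AWQ scaling search is omitted, Python floats stand in for FP16, and the helper names (`quantize_group`, `dequantize_group`, `quantize_row`) are hypothetical.

```python
import random
from typing import List, Tuple

GROUP_SIZE = 128  # "Group 128" in the strategy line
QMAX = 15         # largest UINT4 code (2**4 - 1)


def quantize_group(weights: List[float]) -> Tuple[List[int], float, int]:
    """Asymmetric UINT4 quantization of one group -> (codes, scale, zero_point)."""
    # Extend the range to include 0.0 so the zero point always fits in [0, QMAX].
    lo = min(min(weights), 0.0)
    hi = max(max(weights), 0.0)
    # Guard against an all-zero group, which would otherwise give a zero scale.
    scale = (hi - lo) / QMAX if hi > lo else 1.0
    zero_point = max(0, min(QMAX, round(-lo / scale)))
    codes = [max(0, min(QMAX, round(w / scale) + zero_point)) for w in weights]
    return codes, scale, zero_point


def dequantize_group(codes: List[int], scale: float, zero_point: int) -> List[float]:
    """Map UINT4 codes back to approximate real-valued weights."""
    return [(q - zero_point) * scale for q in codes]


def quantize_row(row: List[float]) -> List[Tuple[List[int], float, int]]:
    """Quantize a weight row in consecutive groups of GROUP_SIZE values."""
    return [quantize_group(row[i:i + GROUP_SIZE]) for i in range(0, len(row), GROUP_SIZE)]


# Toy data: 256 synthetic weights, i.e. two groups (not real model weights).
rng = random.Random(0)
row = [rng.gauss(0.0, 0.02) for _ in range(2 * GROUP_SIZE)]

groups = quantize_row(row)
restored = [v for codes, scale, zp in groups for v in dequantize_group(codes, scale, zp)]
max_err = max(abs(a - b) for a, b in zip(row, restored))
print(f"groups={len(groups)} max_abs_error={max_err:.6f}")
```

Each group carries its own scale and zero point, so the rounding error of any weight is bounded by roughly half of that group's scale. Smaller groups track the local weight range more closely at the cost of storing more scales and zero points.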
	- ## OGA Model Builder
	```
	python builder.py \
	-i <quantized safetensors model dir> \
	-o <oga model output dir> \
	-p int4 \
	-e dml
	```
	- Postprocessed to generate the hybrid model.