amd/Mistral-7B-Instruct-v0.3-awq-g128-int4-asym-fp16-onnx-hybrid

uday610 committed 16 days ago

Commit e1aa29b • 1 Parent(s): b6634c1

Create README.md

Files changed (1)

README.md +39 -0

README.md ADDED

	@@ -0,0 +1,39 @@

+---
+license: apache-2.0
+base_model:
+- mistralai/Mistral-7B-Instruct-v0.3
+---
+# mistralai/Mistral-7B-Instruct-v0.3
+- ## Introduction
+  - Quantization Tool: Quark 0.6.0
+  - OGA Model Builder: v0.5.1
+- ## Quantization Strategy
+  - AWQ / Group 128 / Asymmetric / UINT4 Weights / FP16 Activations
+  - Excluded Layers: None
+  ```
+  python3 quantize_quark.py \
+        --model_dir "$model" \
+        --output_dir "$output_dir" \
+        --quant_scheme w_uint4_per_group_asym \
+        --num_calib_data 128 \
+        --quant_algo awq \
+        --dataset pileval_for_awq_benchmark \
+        --seq_len 512 \
+        --model_export quark_safetensors \
+        --data_type float16 \
+        --exclude_layers [] \
+        --custom_mode awq
+  ```
+- ## OGA Model Builder
+  ```
+  python builder.py \
+    -i <quantized safetensor model dir> \
+    -o <oga model output dir> \
+    -p int4 \
+    -e dml
+  ```
+- Post-processed to generate the hybrid model
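The strategy line in the README ("Group 128 / Asymmetric / UINT4 Weights") names a weight format without showing what it does to the numbers. The sketch below illustrates one common min/max formulation of asymmetric per-group UINT4 quantization in plain Python. It is not Quark's implementation, and it leaves out the activation-aware scale search that AWQ adds on top. The function names are hypothetical, and the storage estimate assumes one FP16 scale and one 4-bit zero point per group, which the README does not state.

```python
from typing import List, Tuple

GROUP_SIZE = 128  # "Group 128" in the strategy line
QMAX = 15         # UINT4 codes span 0..15


def quantize_group(weights: List[float]) -> Tuple[List[int], float, int]:
    """Quantize one group to UINT4 codes; returns (codes, scale, zero_point)."""
    w_min, w_max = min(weights), max(weights)
    # Fall back to scale 1.0 for a constant group to avoid dividing by zero.
    scale = (w_max - w_min) / QMAX or 1.0
    # Asymmetric: the zero point shifts the grid so w_min lands near code 0.
    zero_point = max(0, min(QMAX, round(-w_min / scale)))
    codes = [max(0, min(QMAX, round(w / scale) + zero_point)) for w in weights]
    return codes, scale, zero_point


def dequantize_group(codes: List[int], scale: float, zero_point: int) -> List[float]:
    """Map UINT4 codes back to approximate floating-point weights."""
    return [(c - zero_point) * scale for c in codes]


def quantize_row(row: List[float], group_size: int = GROUP_SIZE):
    """Quantize a weight row group by group, each with its own scale and zero point."""
    return [quantize_group(row[i:i + group_size])
            for i in range(0, len(row), group_size)]


def bits_per_weight(group_size: int = GROUP_SIZE, weight_bits: int = 4,
                    scale_bits: int = 16, zero_point_bits: int = 4) -> float:
    """Average storage per weight, under the assumed per-group metadata layout."""
    return weight_bits + (scale_bits + zero_point_bits) / group_size


if __name__ == "__main__":
    # Synthetic group spanning roughly [-1, 1); real weights come from the model.
    group = [(i - 64) / 64 for i in range(GROUP_SIZE)]
    codes, scale, zero_point = quantize_group(group)
    restored = dequantize_group(codes, scale, zero_point)
    worst = max(abs(a - b) for a, b in zip(group, restored))
    print(f"scale={scale:.4f} zero_point={zero_point} worst_error={worst:.4f}")
    print(f"bits per weight: {bits_per_weight()}")
```

Because every group of 128 weights gets its own scale and zero point, a group whose values are not centered on zero still uses all 16 codes, which is the practical reason to pick an asymmetric scheme over a symmetric one.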