---
license: mit
language:
- en
base_model:
- mistralai/Mistral-7B-Instruct-v0.3
pipeline_tag: text-generation
tags:
- chat
---
# mistralai/Mistral-7B-Instruct-v0.3
- ## Introduction
    - Quantization Tool: Quark 0.6.0
    - OGA Model Builder: v0.5.1
- ## Quantization Strategy
    - AWQ / Group 128 / Asymmetric / UINT4 Weights / FP16 activations (a numerical sketch of this scheme follows the quantization command below)
    - Excluded Layers: None
```
python3 quantize_quark.py \
--model_dir "$model" \
--output_dir "$output_dir" \
--quant_scheme w_uint4_per_group_asym \
--num_calib_data 128 \
--quant_algo awq \
--dataset pileval_for_awq_benchmark \
--seq_len 512 \
--model_export quark_safetensors \
--data_type float16 \
--exclude_layers [] \
--custom_mode awq
```
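For intuition, the sketch below shows what `w_uint4_per_group_asym` means numerically: each group of 128 weights shares one scale and one zero point, and values are rounded into the 0-15 UINT4 range. This is an illustration in plain NumPy only, with hypothetical helper names; it is not Quark's implementation, it omits the activation-aware rescaling that AWQ applies before quantization, and the packed tensor layout in the exported safetensors differs.

```python
# Illustration of per-group asymmetric UINT4 quantization (group size 128).
# Not Quark's implementation; the exported checkpoint packs weights differently.
import numpy as np

def quantize_uint4_per_group_asym(w: np.ndarray, group_size: int = 128):
    """Quantize a 1-D FP32/FP16 weight row to UINT4 with per-group scale and zero point."""
    groups = w.reshape(-1, group_size)                    # [n_groups, group_size]
    w_min = groups.min(axis=1, keepdims=True)
    w_max = groups.max(axis=1, keepdims=True)
    scale = np.maximum((w_max - w_min) / 15.0, 1e-8)      # UINT4 range is 0..15
    zero_point = np.clip(np.round(-w_min / scale), 0, 15)
    q = np.clip(np.round(groups / scale) + zero_point, 0, 15).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate FP weights from UINT4 values, scales and zero points."""
    return (q.astype(np.float32) - zero_point) * scale

row = np.random.randn(4096).astype(np.float32)            # one hypothetical weight row
q, scale, zp = quantize_uint4_per_group_asym(row)
error = np.abs(dequantize(q, scale, zp).reshape(-1) - row).max()
print(f"max abs reconstruction error: {error:.4f}")
```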
- ## OGA Model Builder
```
python builder.py \
-i <quantized safetensor model dir> \
-o <oga model output dir> \
-p int4 \
-e dml
```
- Post-processed to generate the Hybrid Model
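
The resulting model is typically run with the onnxruntime-genai (OGA) runtime. The snippet below is a minimal sketch of that last step, assuming an onnxruntime-genai build with DirectML / Ryzen AI support is installed; the generation-loop API differs slightly across OGA releases (e.g. `append_tokens` versus setting `params.input_ids`), so adjust to the installed version.

```python
# Minimal sketch: generating text from the built OGA model with onnxruntime-genai.
# Assumes a DirectML-capable onnxruntime-genai install; API details vary by release.
import onnxruntime_genai as og

model = og.Model("<oga model output dir>")       # directory produced by builder.py above
tokenizer = og.Tokenizer(model)

params = og.GeneratorParams(model)
params.set_search_options(max_length=256)

generator = og.Generator(model, params)
generator.append_tokens(tokenizer.encode("What does AWQ quantization do?"))

while not generator.is_done():
    generator.generate_next_token()              # older releases also require compute_logits()

print(tokenizer.decode(generator.get_sequence(0)))
```

Formatting the prompt with the base model's chat template generally gives better output than the raw string used here, which is kept plain only to keep the sketch short.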