---
license: llama2
language:
- en
pipeline_tag: text-generation
tags:
- llama
- llama2
- amd
- meta
- facebook
- onnx
base_model:
- meta-llama/Llama-2-7b-hf
---
# meta-llama/Llama-2-7b-hf
- ## Introduction
- Quantization Tool: Quark 0.6.0
- OGA Model Builder: v0.5.1
- Postprocess: hybrid model generation
- ## Quantization Strategy
- AWQ / Group 128 / Asymmetric / UINT4 weights / FP16 activations (a dequantization sketch follows the command below)
- Excluded Layers: None
```
python3 quantize_quark.py \
--model_dir "$model" \
--output_dir "$output_dir" \
--quant_scheme w_uint4_per_group_asym \
--num_calib_data 128 \
--quant_algo awq \
--dataset pileval_for_awq_benchmark \
--seq_len 512 \
--model_export quark_safetensors \
--data_type float16 \
--exclude_layers [] \
--custom_mode awq
```
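In the `w_uint4_per_group_asym` scheme, each group of 128 weights shares one scale and one zero point. As a rough illustration only (not part of the Quark toolchain, and omitting AWQ's activation-aware scale search), the NumPy sketch below shows how a single group could be quantized to asymmetric UINT4 and dequantized back to FP16:
```
# Illustrative sketch of group-wise asymmetric UINT4 (de)quantization,
# group size 128. Not the Quark implementation; AWQ scale search omitted.
import numpy as np

GROUP_SIZE = 128

def quantize_group(w):
    # w: 1-D float array of GROUP_SIZE weights
    w_min, w_max = w.min(), w.max()
    scale = max(w_max - w_min, 1e-8) / 15.0          # UINT4 range: 0..15
    zero_point = np.clip(np.round(-w_min / scale), 0, 15)
    q = np.clip(np.round(w / scale) + zero_point, 0, 15).astype(np.uint8)
    return q, np.float16(scale), np.uint8(zero_point)

def dequantize_group(q, scale, zero_point):
    # Map stored UINT4 values back to FP16 weights
    return ((q.astype(np.float32) - zero_point) * scale).astype(np.float16)

# Quantize one group of 128 random weights and check the reconstruction error
w = np.random.randn(GROUP_SIZE).astype(np.float32)
q, scale, zp = quantize_group(w)
w_hat = dequantize_group(q, scale, zp)
print("max abs error:", np.abs(w - w_hat.astype(np.float32)).max())
```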
- ## OGA Model Builder
```
python builder.py \
-i <quantized safetensor model dir> \
-o <oga model output dir> \
-p int4 \
-e dml
```
- Post-processed to generate the hybrid model
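Once built, the model can be driven through the onnxruntime-genai (OGA) Python API. The sketch below is only illustrative: the model directory is a placeholder, and the generation-loop calls differ slightly across onnxruntime-genai releases (older releases use `params.input_ids` together with `generator.compute_logits()` instead of `generator.append_tokens()`).
```
import onnxruntime_genai as og

# Placeholder path: directory produced by builder.py / post-processing
model = og.Model("<oga model output dir>")
tokenizer = og.Tokenizer(model)

params = og.GeneratorParams(model)
params.set_search_options(max_length=256)

generator = og.Generator(model, params)
generator.append_tokens(tokenizer.encode("What is AWQ quantization?"))

# Token-by-token generation loop
while not generator.is_done():
    generator.generate_next_token()

print(tokenizer.decode(generator.get_sequence(0)))
```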