---
license: mit
language:
- en
base_model:
- mistralai/Mistral-7B-Instruct-v0.3
pipeline_tag: text-generation
tags:
- chat
---


# mistralai/Mistral-7B-Instruct-v0.3
- ## Introduction
  - Quantization Tool: Quark 0.6.0
  - OGA Model Builder: v0.5.1    
- ## Quantization Strategy
  - AWQ / Group 128 / Asymmetric / UINT4 weights / FP16 activations (see the illustrative sketch after the command below)
  - Excluded Layers: None
  ```
  python3 quantize_quark.py \
        --model_dir "$model" \
        --output_dir "$output_dir" \
        --quant_scheme w_uint4_per_group_asym \
        --num_calib_data 128 \
        --quant_algo awq \
        --dataset pileval_for_awq_benchmark \
        --seq_len 512 \
        --model_export quark_safetensors \
        --data_type float16 \
        --exclude_layers [] \
        --custom_mode awq
  ```
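  - For intuition only, the sketch below (not Quark's implementation) shows how per-group asymmetric UINT4 quantization maps FP16 weights to 4-bit integers plus per-group scales and zero points; the AWQ step, which searches for activation-aware per-channel scales on calibration data, is omitted.
  ```
  import numpy as np

  def quantize_uint4_per_group_asym(w: np.ndarray, group_size: int = 128):
      """Illustrative per-group asymmetric UINT4 quantization (not Quark's code).

      w: FP16 weight matrix (out_features, in_features); in_features must be
      divisible by group_size. Dequantization is w ~= (q - zero_point) * scale.
      """
      out_f, in_f = w.shape
      g = w.reshape(out_f, in_f // group_size, group_size).astype(np.float32)

      w_min = g.min(axis=-1, keepdims=True)
      w_max = g.max(axis=-1, keepdims=True)
      scale = np.maximum((w_max - w_min) / 15.0, 1e-8)      # 4-bit range 0..15
      zero_point = np.clip(np.rint(-w_min / scale), 0, 15)  # asymmetric offset

      q = np.clip(np.rint(g / scale + zero_point), 0, 15).astype(np.uint8)
      return q.reshape(out_f, in_f), scale.astype(np.float16), zero_point.astype(np.uint8)

  def dequantize_uint4_per_group_asym(q, scale, zero_point, group_size: int = 128):
      out_f, in_f = q.shape
      g = q.reshape(out_f, in_f // group_size, group_size).astype(np.float32)
      return ((g - zero_point) * scale).reshape(out_f, in_f).astype(np.float16)
  ```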
- ## OGA Model Builder
  ```
  python builder.py \
    -i <quantized safetensor model dir> \
    -o <oga model output dir> \
    -p int4 \
    -e dml
  ```
- Post-processed to generate the hybrid model (a minimal inference sketch follows below)
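- Quick test (sketch): loading the generated model directory with onnxruntime-genai. This is only a sketch; the exact Python API differs between onnxruntime-genai releases, the hybrid model requires the Ryzen AI build of onnxruntime-genai, and the model path and prompt below are placeholders.
  ```
  import onnxruntime_genai as og

  # Placeholder: directory produced by builder.py / post-processing.
  model = og.Model("<oga model output dir>")
  tokenizer = og.Tokenizer(model)
  stream = tokenizer.create_stream()

  # Mistral-style instruct prompt (adjust to the model's chat template).
  input_tokens = tokenizer.encode("[INST] What does AWQ quantization do? [/INST]")

  params = og.GeneratorParams(model)
  params.set_search_options(max_length=256)
  params.input_ids = input_tokens

  # Greedy token-by-token generation with streamed decoding.
  generator = og.Generator(model, params)
  while not generator.is_done():
      generator.compute_logits()
      generator.generate_next_token()
      print(stream.decode(generator.get_next_tokens()[0]), end="", flush=True)
  print()
  ```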