pooja-ganesh committed
Commit 38a7a9d · verified · 1 Parent(s): 5cceb63

Create README.md

Files changed (1)
  1. README.md +39 -0
README.md ADDED
@@ -0,0 +1,39 @@
+ ---
+ language:
+ - zh
+ - en
+ tags:
+ - glm
+ - chatglm
+ - thudm
+ base_model: THUDM/chatglm3-6b
+ ---
+
+ # chatglm3-6b-awq-w-int4-asym-gs128-lmhead-a-fp16-onnx-ryzen-strix-hybrid
+ - ## Introduction
+ This model was created by quantizing THUDM/chatglm3-6b with [Quark](https://quark.docs.amd.com/latest/index.html), using calibration samples from the Pile dataset, and then converting the result to ONNX with the [onnxruntime-genai model builder](https://github.com/microsoft/onnxruntime-genai/tree/main/src/python/py/models).
+ - ## Quantization Strategy
+ - ***Quantized Layers***: All linear layers, including "transformer.output_layer"
+ - ***Weight***: uint4 asymmetric per-group, with group size 128
+ - AWQ / Group 128 / Asymmetric / FP16 activations / INT4 weights
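To make the weight format above concrete, here is a minimal NumPy sketch of asymmetric per-group uint4 quantization with group size 128. It illustrates only the storage scheme (one scale and one zero point per group of 128 weights). It does not reproduce AWQ's activation-aware scaling or Quark's actual implementation, and the function names `quantize_uint4_asym` and `dequantize` are hypothetical.

```python
import numpy as np

def quantize_uint4_asym(w, group_size=128):
    """Quantize a 2-D weight matrix to uint4 codes, one (scale, zero point) per group.

    Illustrative helper, not part of Quark. Assumes the number of columns
    is a multiple of group_size.
    """
    rows, cols = w.shape
    assert cols % group_size == 0, "columns must be divisible by group_size"
    g = w.reshape(rows, cols // group_size, group_size).astype(np.float32)
    w_min = g.min(axis=-1, keepdims=True)
    w_max = g.max(axis=-1, keepdims=True)
    # uint4 has 16 levels, so the group's range is split into 15 steps.
    scale = (w_max - w_min) / 15.0
    scale = np.where(scale == 0, 1.0, scale)  # guard against constant groups
    # Asymmetric: the zero point shifts the grid so that w_min maps near code 0.
    zero_point = np.clip(np.round(-w_min / scale), 0, 15)
    q = np.clip(np.round(g / scale) + zero_point, 0, 15).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point, shape):
    """Map uint4 codes back to float32 weights with the original shape."""
    return ((q.astype(np.float32) - zero_point) * scale).reshape(shape)

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 256)).astype(np.float32)
q, scale, zero_point = quantize_uint4_asym(w)
w_hat = dequantize(q, scale, zero_point, w.shape)
max_err = float(np.abs(w - w_hat).max())
```

For groups whose values straddle zero, as in this random example, the reconstruction error per weight is bounded by half a quantization step of that group.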
+ - ## Quick Start
+ For a quick start, refer to the AMD [RyzenAI-SW-EA](https://account.amd.com/en/member/ryzenai-sw-ea.html) page (to be updated).
+
+ ## Evaluation
+ Quark currently uses perplexity (PPL) as the evaluation metric for accuracy loss before and after quantization. The specific PPL algorithm can be found in `quantize_quark.py`.
+ The evaluation is run in pseudo-quantization mode, so the results may differ slightly from the accuracy of actual quantized inference. These results are provided for reference only.
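As a reference for the metric named above, the following sketch computes perplexity in its standard form, the exponential of the mean negative log-likelihood of the target tokens. The README does not state how `quantize_quark.py` windows the text or which dataset it uses, so this is an assumption-level illustration rather than Quark's algorithm, and the function name `perplexity` is hypothetical.

```python
import numpy as np

def perplexity(logits, targets):
    """Perplexity of target tokens given per-position logits.

    logits: array of shape (num_tokens, vocab_size)
    targets: integer token ids of shape (num_tokens,)
    """
    logits = np.asarray(logits, dtype=np.float64)
    targets = np.asarray(targets)
    # Numerically stable log-softmax over the vocabulary axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # Negative log-likelihood of each target token.
    nll = -log_probs[np.arange(len(targets)), targets]
    return float(np.exp(nll.mean()))

# A model that is uniform over a 4-token vocabulary has perplexity 4.
uniform_ppl = perplexity(np.zeros((3, 4)), [0, 1, 2])
```

Comparing this value for the original and the quantized model on the same text gives the accuracy-loss figure the section refers to: a lower perplexity means the model assigns higher probability to the reference tokens.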
+
+ #### License
+ Modifications Copyright (c) 2024 Advanced Micro Devices, Inc. All rights reserved.
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.