Commit b7b236d by cicdatopea
Parent(s): aa98c01
Update README.md

README.md (CHANGED)
@@ -16,7 +16,7 @@ CPU/ CUDA requires auto-round version>0.3.1
 ```python
 from auto_round import AutoRoundConfig ##must import for auto-round format
 from transformers import AutoModelForCausalLM,AutoTokenizer
-quantized_model_dir = "OPEA/Qwen2.5-32B-Instruct-int4-inc"
+quantized_model_dir = "OPEA/Qwen2.5-32B-Instruct-int4-sym-mixed-inc"
 tokenizer = AutoTokenizer.from_pretrained(quantized_model_dir)
 
 model = AutoModelForCausalLM.from_pretrained(
@@ -127,7 +127,7 @@ prompt = "请简短介绍一下阿里巴巴公司"
 pip3 install lm-eval==0.4.5
 
 ```bash
-auto-round --model "OPEA/Qwen2.5-32B-Instruct-int4-inc" --eval --eval_bs 16 --tasks leaderboard_ifeval,leaderboard_mmlu_pro,gsm8k,lambada_openai,hellaswag,piqa,winogrande,truthfulqa_mc1,openbookqa,boolq,arc_easy,arc_challenge,cmmlu,ceval-valid
+auto-round --model "OPEA/Qwen2.5-32B-Instruct-int4-sym-mixed-inc" --eval --eval_bs 16 --tasks leaderboard_ifeval,leaderboard_mmlu_pro,gsm8k,lambada_openai,hellaswag,piqa,winogrande,truthfulqa_mc1,openbookqa,boolq,arc_easy,arc_challenge,cmmlu,ceval-valid
 ```
 
 | Metric | BF16 | INT4 |
@@ -156,7 +156,7 @@ auto-round --model "OPEA/Qwen2.5-32B-Instruct-int4-inc" --eval --eval_bs 16 --t
 
 Here is the sample command to generate the model.
 
-For symmetric quantization, we found overflow/NAN will occur for some backends, so better fallback some layers. auto_round requires version
+For symmetric quantization, we found overflow/NAN will occur for some backends, so better fallback some layers. auto_round requires version >=0.4.1
 
 ```bash
 auto-round \
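For context on the overflow/NAN note added in the last hunk: symmetric int4 maps each value onto a signed 4-bit grid {-8, …, 7} around a single scale with the zero-point fixed at 0. The sketch below is purely illustrative (it is not auto-round's kernel, and the function names are invented for this example); it only shows how symmetric quantization saturates outliers, the kind of extreme values that can overflow to inf/NaN in low-precision backends, which is why the README falls back some layers to higher precision.

```python
# Illustrative sketch of per-tensor symmetric int4 quantization
# (NOT auto-round's implementation): one scale, zero-point = 0,
# values clamped to the signed 4-bit range [-8, 7].

def quantize_sym_int4(x: float, scale: float) -> int:
    """Map a real value onto the symmetric int4 grid."""
    q = round(x / scale)
    return max(-8, min(7, q))  # saturate instead of overflowing

def dequantize_sym_int4(q: int, scale: float) -> float:
    """Reconstruct the approximate real value from its int4 code."""
    return q * scale

scale = 0.1
print(quantize_sym_int4(0.3, scale))   # 3  -> representable on the grid
print(quantize_sym_int4(5.0, scale))   # 7  -> outlier clamped from 50
print(quantize_sym_int4(-5.0, scale))  # -8 -> outlier clamped from -50
```

Values much larger than `scale * 7` are clamped and their information is lost; keeping the layers most sensitive to such outliers in higher precision (the "fallback" the commit mentions) avoids that saturation.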