---
tags:
- gptq
- 4bit
- int4
- gptqmodel
- modelcloud
---

This model has been quantized using [GPTQModel](https://github.com/ModelCloud/GPTQModel) with the following configuration:

- **bits**: 4
- **group_size**: 128
- **desc_act**: false
- **static_groups**: false
- **sym**: true
- **lm_head**: false
- **damp_percent**: 0.0025
- **damp_auto_increment**: 0.0015
- **true_sequential**: true
- **model_name_or_path**: ""
- **model_file_base_name**: "model"
- **quant_method**: "gptq"
- **checkpoint_format**: "gptq"
- **meta**:
  - **quantizer**: "gptqmodel:1.0.3-dev0"

## Example:

```python
from transformers import AutoTokenizer
from gptqmodel import GPTQModel

model_name = "ModelCloud/GRIN-MoE-gptq-4bit"

prompt = [
    {"role": "system", "content": "You are the GRIN-MoE model from Microsoft, a helpful assistant."},
    {"role": "user", "content": "I am in Shanghai, preparing to visit the natural history museum. Can you tell me the best way to"},
]

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = GPTQModel.from_quantized(model_name, trust_remote_code=True)

# Apply the chat template and generate up to 100 new tokens.
input_tensor = tokenizer.apply_chat_template(prompt, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(input_ids=input_tensor.to(model.device), max_new_tokens=100)

# Decode only the newly generated tokens, skipping the prompt.
result = tokenizer.decode(outputs[0][input_tensor.shape[1]:], skip_special_tokens=True)
print(result)
```
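For reference, the settings listed above correspond to fields of GPTQModel's `QuantizeConfig`. The sketch below illustrates how a quantization run with these settings could look; it is a minimal example assuming the GPTQModel `from_pretrained`/`quantize`/`save_quantized` API, and the source checkpoint and calibration text are placeholders, not the exact recipe used for this model.

```python
from transformers import AutoTokenizer
from gptqmodel import GPTQModel, QuantizeConfig

source_model = "microsoft/GRIN-MoE"  # assumed source checkpoint

quant_config = QuantizeConfig(
    bits=4,              # 4-bit integer weights
    group_size=128,      # one quantization group per 128 weight columns
    desc_act=False,
    sym=True,            # symmetric quantization
    damp_percent=0.0025,
)

tokenizer = AutoTokenizer.from_pretrained(source_model, trust_remote_code=True)

# A real calibration set should contain many representative samples;
# this single tokenized string is purely illustrative.
calibration_dataset = [tokenizer("GPTQ calibrates quantization error on sample text.")]

model = GPTQModel.from_pretrained(source_model, quant_config, trust_remote_code=True)
model.quantize(calibration_dataset)
model.save_quantized("GRIN-MoE-gptq-4bit")
```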
## lm_eval results:

| Tasks | Metric | | GRIN-MoE | GRIN-MoE-gptq-4bit |
| ------------------------------------- | ---------- | --- | -------- | ------------------ |
| arc_challenge | acc | ↑ | 0.6408 | 0.6425 |
| | acc_norm | ↑ | 0.6561 | 0.6587 |
| arc_easy | acc | ↑ | 0.8645 | 0.8683 |
| | acc_norm | ↑ | 0.8422 | 0.8460 |
| boolq | acc | ↑ | 0.8820 | 0.8765 |
| hellaswag | acc | ↑ | 0.6972 | 0.6891 |
| | acc_norm | ↑ | 0.8518 | 0.8486 |
| lambada_openai | acc | ↑ | 0.7058 | 0.7068 |
| | perplexity | ↓ | 3.4568 | 3.5732 |
| mmlu | acc | ↑ | 0.7751 | 0.7706 |
| - humanities | acc | ↑ | 0.7394 | 0.7384 |
| - formal_logic | acc | ↑ | 0.6429 | 0.6746 |
| - high_school_european_history | acc | ↑ | 0.8606 | 0.8364 |
| - high_school_us_history | acc | ↑ | 0.9118 | 0.9020 |
| - high_school_world_history | acc | ↑ | 0.8903 | 0.8734 |
| - international_law | acc | ↑ | 0.9256 | 0.9091 |
| - jurisprudence | acc | ↑ | 0.8426 | 0.8519 |
| - logical_fallacies | acc | ↑ | 0.8344 | 0.8528 |
| - moral_disputes | acc | ↑ | 0.7977 | 0.8208 |
| - moral_scenarios | acc | ↑ | 0.6961 | 0.6849 |
| - philosophy | acc | ↑ | 0.8199 | 0.8071 |
| - prehistory | acc | ↑ | 0.8457 | 0.8426 |
| - professional_law | acc | ↑ | 0.6173 | 0.6193 |
| - world_religions | acc | ↑ | 0.8480 | 0.8655 |
| - other | acc | ↑ | 0.8130 | 0.8050 |
| - business_ethics | acc | ↑ | 0.8100 | 0.7800 |
| - clinical_knowledge | acc | ↑ | 0.8415 | 0.8302 |
| - college_medicine | acc | ↑ | 0.7514 | 0.7457 |
| - global_facts | acc | ↑ | 0.5700 | 0.5400 |
| - human_aging | acc | ↑ | 0.7803 | 0.7668 |
| - management | acc | ↑ | 0.8447 | 0.8447 |
| - marketing | acc | ↑ | 0.9145 | 0.9103 |
| - medical_genetics | acc | ↑ | 0.9200 | 0.8900 |
| - miscellaneous | acc | ↑ | 0.8966 | 0.8927 |
| - nutrition | acc | ↑ | 0.8333 | 0.8268 |
| - professional_accounting | acc | ↑ | 0.6489 | 0.6560 |
| - professional_medicine | acc | ↑ | 0.8750 | 0.8603 |
| - virology | acc | ↑ | 0.5422 | 0.5361 |
| - social sciences | acc | ↑ | 0.8638 | 0.8544 |
| - econometrics | acc | ↑ | 0.5789 | 0.5789 |
| - high_school_geography | acc | ↑ | 0.9091 | 0.8788 |
| - high_school_government_and_politics | acc | ↑ | 0.9585 | 0.9430 |
| - high_school_macroeconomics | acc | ↑ | 0.8308 | 0.8103 |
| - high_school_microeconomics | acc | ↑ | 0.9328 | 0.9286 |
| - high_school_psychology | acc | ↑ | 0.9321 | 0.9303 |
| - human_sexuality | acc | ↑ | 0.8779 | 0.8626 |
| - professional_psychology | acc | ↑ | 0.8382 | 0.8219 |
| - public_relations | acc | ↑ | 0.7545 | 0.7727 |
| - security_studies | acc | ↑ | 0.7878 | 0.7918 |
| - sociology | acc | ↑ | 0.8905 | 0.8955 |
| - us_foreign_policy | acc | ↑ | 0.9000 | 0.8800 |
| - stem | acc | ↑ | 0.7044 | 0.7031 |
| - abstract_algebra | acc | ↑ | 0.5000 | 0.4500 |
| - anatomy | acc | ↑ | 0.7407 | 0.7481 |
| - astronomy | acc | ↑ | 0.8618 | 0.8618 |
| - college_biology | acc | ↑ | 0.8889 | 0.8750 |
| - college_chemistry | acc | ↑ | 0.6100 | 0.5900 |
| - college_computer_science | acc | ↑ | 0.7100 | 0.6700 |
| - college_mathematics | acc | ↑ | 0.5100 | 0.5800 |
| - college_physics | acc | ↑ | 0.4608 | 0.4608 |
| - computer_security | acc | ↑ | 0.8200 | 0.8200 |
| - conceptual_physics | acc | ↑ | 0.7787 | 0.7660 |
| - electrical_engineering | acc | ↑ | 0.6828 | 0.6828 |
| - elementary_mathematics | acc | ↑ | 0.7566 | 0.7593 |
| - high_school_biology | acc | ↑ | 0.9000 | 0.9097 |
| - high_school_chemistry | acc | ↑ | 0.6650 | 0.6650 |
| - high_school_computer_science | acc | ↑ | 0.8700 | 0.8600 |
| - high_school_mathematics | acc | ↑ | 0.4370 | 0.4296 |
| - high_school_physics | acc | ↑ | 0.5960 | 0.5894 |
| - high_school_statistics | acc | ↑ | 0.7176 | 0.7222 |
| - machine_learning | acc | ↑ | 0.6071 | 0.6339 |
| openbookqa | acc | ↑ | 0.3920 | 0.3860 |
| | acc_norm | ↑ | 0.4900 | 0.4860 |
| piqa | acc | ↑ | 0.8183 | 0.8166 |
| | acc_norm | ↑ | 0.8205 | 0.8177 |
| rte | acc | ↑ | 0.8014 | 0.7834 |
| truthfulqa_mc1 | acc | ↑ | 0.3880 | 0.3990 |
| winogrande | acc | ↑ | 0.7940 | 0.7680 |

| Groups | Metric | | GRIN-MoE | GRIN-MoE-gptq-4bit |
| ---------------- | ------ | --- | -------- | ------------------ |
| mmlu | acc | ↑ | 0.7751 | 0.7706 |
| - humanities | acc | ↑ | 0.7394 | 0.7384 |
| - other | acc | ↑ | 0.8130 | 0.8050 |
| - social sciences | acc | ↑ | 0.8638 | 0.8544 |
| - stem | acc | ↑ | 0.7044 | 0.7031 |
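These numbers come from EleutherAI's lm-evaluation-harness (`lm_eval`). Below is a minimal sketch of how a comparable run could be launched via the harness's Python API; the task list, batch size, and the assumption that the installed `transformers`/`gptqmodel` stack can load this checkpoint through the standard `hf` backend are illustrative, not the original evaluation setup.

```python
from lm_eval import simple_evaluate

results = simple_evaluate(
    model="hf",
    model_args="pretrained=ModelCloud/GRIN-MoE-gptq-4bit,trust_remote_code=True",
    tasks=["arc_challenge", "arc_easy", "boolq", "hellaswag", "mmlu", "winogrande"],
    batch_size=8,
)

# Each task maps to its metric dict (acc, acc_norm, ...).
for task, metrics in results["results"].items():
    print(task, metrics)
```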