
Quantized with AutoAWQ, installed at a pinned commit together with flash_attn: pip install git+https://github.com/casper-hansen/AutoAWQ.git@f0321eedca887c12680553fc561d176b03b1b9a5 flash_attn

The following code was used for quantization:

from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = 'models/Phi-3-medium-128k-instruct'
quant_path = 'models/Phi-3-medium-128k-instruct-awq'
quant_config = { "zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM" }

# Load model
model = AutoAWQForCausalLM.from_pretrained(model_path, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# Quantize
model.quantize(tokenizer, quant_config=quant_config)

# Save quantized model
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
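
For readers unfamiliar with the settings in quant_config, the following is a minimal sketch of what "w_bit": 4, "q_group_size": 128 and "zero_point": True mean for a single row of weights. It is an illustration in plain Python, not AutoAWQ's implementation: AWQ additionally searches for activation-aware per-channel scales before rounding, and the "GEMM" version packs the resulting 4-bit integers into int32 tensors. The function names quantize_group, dequantize_group and quantize_row are hypothetical and chosen only for this example.

```python
from typing import List, Tuple


def quantize_group(weights: List[float], w_bit: int = 4) -> Tuple[List[int], float, int]:
    """Asymmetric (zero-point) quantization of one group of weights."""
    q_max = 2 ** w_bit - 1  # 15 for 4-bit weights
    w_min, w_max = min(weights), max(weights)
    # One scale per group; the lower bound avoids a zero scale for constant groups.
    scale = max((w_max - w_min) / q_max, 1e-5)
    # The zero point shifts the integer grid so that w_min maps near 0.
    zero = min(max(round(-w_min / scale), 0), q_max)
    q = [min(max(round(w / scale) + zero, 0), q_max) for w in weights]
    return q, scale, zero


def dequantize_group(q: List[int], scale: float, zero: int) -> List[float]:
    """Map the integers back to approximate floating-point weights."""
    return [(v - zero) * scale for v in q]


def quantize_row(row: List[float], q_group_size: int = 128, w_bit: int = 4):
    """Split a row into groups of q_group_size and quantize each independently."""
    return [
        quantize_group(row[i:i + q_group_size], w_bit)
        for i in range(0, len(row), q_group_size)
    ]
```

Each group of 128 weights gets its own scale and zero point, so an outlier in one group does not coarsen the grid used for the others. Smaller groups reduce rounding error at the cost of storing more scales and zero points.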

Original model here: https://huggingface.co/microsoft/Phi-3-medium-128k-instruct


license: mit

Safetensors model size: 2.15B params. Tensor types: I32, FP16.