---
library_name: transformers
tags: []
---
# yujiepan/llama-3-tiny-random-gptq-w4
4-bit weight-only quantization by AutoGPTQ of [yujiepan/llama-3-tiny-random](https://huggingface.co/yujiepan/llama-3-tiny-random), produced with the following script:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

model_id = "yujiepan/llama-3-tiny-random"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# 4-bit weight-only GPTQ quantization; group_size=-1 disables grouping
# (a single quantization group per weight column), and "c4" is used as
# the calibration dataset.
quantization_config = GPTQConfig(
    bits=4,
    group_size=-1,
    dataset="c4",
    tokenizer=tokenizer,
)

# Passing quantization_config quantizes the full-precision checkpoint
# at load time; device_map="auto" places it on the available device.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    quantization_config=quantization_config,
)
```
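
Once loaded, the quantized model behaves like any other `transformers` causal LM. A minimal inference sketch (the prompt and generation settings below are illustrative, not from the original card):

```python
# Run a short generation with the quantized model.
inputs = tokenizer("Hello, world!", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```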