---
library_name: transformers
tags: []
---

# yujiepan/llama-3-tiny-random-gptq-w4

4-bit weight-only quantization of [yujiepan/llama-3-tiny-random](https://huggingface.co/yujiepan/llama-3-tiny-random), produced with AutoGPTQ.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig
import torch

model_id = "yujiepan/llama-3-tiny-random"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# 4-bit weight-only GPTQ; group_size=-1 selects per-column quantization.
# The "c4" dataset is used for calibration.
quantization_config = GPTQConfig(
    bits=4, group_size=-1,
    dataset="c4",
    tokenizer=tokenizer,
)
# Passing the config to from_pretrained quantizes the model while loading it.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    quantization_config=quantization_config,
)
```
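
To give a rough feel for what `bits=4, group_size=-1` means numerically, here is a minimal NumPy sketch. It is not the GPTQ algorithm: GPTQ uses calibration data to compensate for rounding error, while this sketch only does plain round-to-nearest. It assumes the weight matrix is laid out as `(out_features, in_features)` and that "no grouping" means one scale and zero-point per output channel. The function names `quantize_per_channel_4bit` and `dequantize` are hypothetical and do not come from AutoGPTQ or transformers.

```python
import numpy as np


def quantize_per_channel_4bit(weight: np.ndarray):
    """Asymmetric round-to-nearest 4-bit quantization (illustrative, NOT GPTQ).

    Assumption: weight has shape (out_features, in_features) and each row
    gets a single (scale, zero) pair, i.e. no grouping along the input axis.
    """
    qmax = 2**4 - 1  # 4 bits give the integer levels 0..15
    w_min = weight.min(axis=1, keepdims=True)
    w_max = weight.max(axis=1, keepdims=True)
    # Guard against constant rows, which would otherwise give a zero scale.
    scale = np.maximum(w_max - w_min, 1e-8) / qmax
    zero = np.round(-w_min / scale)
    q = np.clip(np.round(weight / scale) + zero, 0, qmax).astype(np.uint8)
    return q, scale, zero


def dequantize(q: np.ndarray, scale: np.ndarray, zero: np.ndarray) -> np.ndarray:
    """Map 4-bit integer codes back to approximate float weights."""
    return (q.astype(np.float32) - zero) * scale


# Random stand-in for one linear layer's weights.
rng = np.random.default_rng(0)
weight = rng.normal(size=(8, 16)).astype(np.float32)

q, scale, zero = quantize_per_channel_4bit(weight)
weight_hat = dequantize(q, scale, zero)
max_error = float(np.abs(weight - weight_hat).max())
print(f"max abs reconstruction error: {max_error:.4f}")
```

With round-to-nearest, the reconstruction error of each weight stays within about half a quantization step (`scale / 2`) for its row. Smaller groups would give each scale a narrower range to cover, at the cost of storing more scales and zero-points.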