mlabonne/gpt2-GPTQ-4bit



	---
	license: apache-2.0
	language:
	- en
	pipeline_tag: text-generation
	tags:
	- AutoGPTQ
	- 4bit
	- GPTQ
	---

	This is a 4-bit quantized version of [GPT-2](https://huggingface.co/gpt2), created with [AutoGPTQ](https://github.com/PanQiWei/AutoGPTQ).

	You can load this model with the AutoGPTQ library, installed with the following command:

	```
	pip install auto-gptq
	```

	You can then download the model from the hub using the following code:

	```python
	from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
	from transformers import AutoTokenizer

	model_id = "mlabonne/gpt2-GPTQ-4bit"
	quantize_config = BaseQuantizeConfig(bits=4, group_size=128)
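	# from_quantized loads already-quantized weights; from_pretrained expects a full-precision checkpoint to quantize
	model = AutoGPTQForCausalLM.from_quantized(model_id, quantize_config=quantize_config)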
	tokenizer = AutoTokenizer.from_pretrained(model_id)
	```
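
To make the `bits=4, group_size=128` settings above more concrete, here is a minimal sketch of group-wise 4-bit round-to-nearest quantization in plain Python. It is a simplification, not the GPTQ algorithm itself (GPTQ also adjusts the remaining weights to compensate for rounding error), and the function names and the random test weights are hypothetical. It only illustrates how each group of 128 weights is stored as integers in the range 0 to 15 plus one scale and one offset.

```python
import random


def quantize_group(weights, bits=4):
    """Map one group of floats to integers in [0, 2**bits - 1]."""
    qmax = 2 ** bits - 1
    w_min, w_max = min(weights), max(weights)
    # Fall back to a scale of 1.0 when every weight in the group is identical.
    scale = (w_max - w_min) / qmax or 1.0
    q = [min(qmax, round((w - w_min) / scale)) for w in weights]
    return q, scale, w_min


def dequantize_group(q, scale, w_min):
    """Recover approximate floats from the stored integers."""
    return [w_min + v * scale for v in q]


def quantize(weights, bits=4, group_size=128):
    """Quantize a flat list of weights, one (q, scale, offset) tuple per group."""
    return [
        quantize_group(weights[start:start + group_size], bits)
        for start in range(0, len(weights), group_size)
    ]


# Hypothetical data: 768 small random weights, i.e. 6 groups of 128.
random.seed(0)
weights = [random.gauss(0.0, 0.02) for _ in range(768)]

groups = quantize(weights)
restored = [w for q, scale, w_min in groups for w in dequantize_group(q, scale, w_min)]
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(f"{len(groups)} groups, max abs error {max_error:.5f}")
```

A smaller `group_size` gives each scale fewer weights to cover, which tends to reduce rounding error at the cost of storing more scales and offsets.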

	This model works with the traditional [Text Generation pipeline](https://huggingface.co/docs/transformers/main_classes/pipelines#transformers.TextGenerationPipeline).
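
As a rough sketch of that pipeline usage, the helper below loads the model and tokenizer and returns a single completion. The `generate` function, its default arguments, and the choice of `from_quantized` as the loader are assumptions made for illustration, not something this model card specifies. `TextGenerationPipeline` is the class from the link above.

```python
def generate(prompt, max_new_tokens=40, model_id="mlabonne/gpt2-GPTQ-4bit"):
    """Hypothetical helper: load the quantized model and return one completion."""
    # Imported inside the function so the heavy dependencies load only on first use.
    from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
    from transformers import AutoTokenizer, TextGenerationPipeline

    # Assumption: from_quantized is the appropriate loader for this checkpoint.
    quantize_config = BaseQuantizeConfig(bits=4, group_size=128)
    model = AutoGPTQForCausalLM.from_quantized(model_id, quantize_config=quantize_config)
    tokenizer = AutoTokenizer.from_pretrained(model_id)

    pipe = TextGenerationPipeline(model=model, tokenizer=tokenizer)
    # The pipeline returns a list of dicts, one per generated sequence.
    outputs = pipe(prompt, max_new_tokens=max_new_tokens)
    return outputs[0]["generated_text"]
```

Calling `generate("I have a dream")` downloads the weights on first use and returns the prompt followed by the generated continuation. For repeated calls, it would be better to build the pipeline once and reuse it.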