---
quantized_by: spedrox-sac
license: mit
pipeline_tag: text-generation
base_model:
- meta-llama/Llama-3.2-1B
language:
- en
tags:
- text-generation
- text-model
- quantized_model
---
# Quantized Llama 3.2-1B

This repository contains a quantized version of the [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B) model, optimized for a reduced memory footprint and faster inference.
## Quantization Details

The model was quantized with GPTQ (accurate post-training quantization for generative pre-trained transformers) using the following parameters:

- Quantization method: GPTQ
- Number of bits: 4
- Calibration dataset: c4
## Usage

To use the quantized model, load it with the `load_quantized_model` function from the `optimum.gptq` library:
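A minimal sketch of the loading step, assuming `optimum`, `accelerate`, and `transformers` are installed. The exact API surface can vary between `optimum` versions, and the `save_folder` argument is a placeholder path:

```python
def load_gptq_llama(save_folder: str):
    """Load the GPTQ-quantized Llama 3.2-1B checkpoint from `save_folder`.

    Imports are deferred so the sketch can be read (and imported) without
    the optional dependencies installed.
    """
    import torch
    from accelerate import init_empty_weights
    from optimum.gptq import load_quantized_model
    from transformers import AutoModelForCausalLM

    # Build an empty (weightless) model skeleton on the meta device,
    # then fill it with the quantized weights saved in `save_folder`.
    with init_empty_weights():
        empty_model = AutoModelForCausalLM.from_pretrained(
            "meta-llama/Llama-3.2-1B", torch_dtype=torch.float16
        )
    empty_model.tie_weights()
    return load_quantized_model(
        empty_model, save_folder=save_folder, device_map="auto"
    )
```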
Make sure to replace `save_folder` with the path to the directory where the quantized model is saved.
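Once loaded, the model behaves like any other `transformers` causal LM. A hypothetical helper (the function name and prompt handling are illustrative, not part of this repository):

```python
def generate_text(model, prompt: str, max_new_tokens: int = 50) -> str:
    """Tokenize `prompt`, run generation, and decode the result."""
    from transformers import AutoTokenizer

    # The quantized model shares the base model's tokenizer.
    tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B")
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```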
## Requirements

- Python 3.8 or higher
- PyTorch 2.0 or higher
- Transformers
- Optimum
- Accelerate
- Bitsandbytes
- Auto-GPTQ
You can install these dependencies using pip:
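For example (a representative command; pin versions as needed for your environment):

```shell
pip install "torch>=2.0" transformers optimum accelerate bitsandbytes auto-gptq
```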
## Disclaimer
This quantized model is provided for research and experimentation purposes. While quantization can significantly reduce model size and improve inference speed, it may also result in a slight decrease in accuracy compared to the original model.
## Acknowledgements
- Meta AI for releasing the Llama 3.2-1B model.
- The authors of the GPTQ quantization method.
- The Hugging Face team for providing the tools and resources for model sharing and deployment.