---
quantized_by: spedrox-sac
license: mit
pipeline_tag: text-generation
base_model:
- meta-llama/Llama-3.2-1B
language:
- en
tags:
- text-generation
- text-model
- quantized_model
---

# Quantized Llama 3.2-1B

This repository contains a quantized version of [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B), optimized for a reduced memory footprint and faster inference.

## Quantization Details

The model was quantized using GPTQ (Generative Pre-trained Transformer Quantization) with the following parameters:

- **Quantization method:** GPTQ
- **Number of bits:** 4
- **Calibration dataset:** c4
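
For reference, here is a minimal sketch of how a run with these parameters might look using `optimum`'s `GPTQQuantizer`. The `model_seqlen` value and the `save_folder` path are illustrative assumptions, not the exact settings used for this repository:

```python
# Sketch of 4-bit GPTQ quantization with optimum; not the exact script used
# for this repository. model_seqlen and the save path are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from optimum.gptq import GPTQQuantizer

model_id = "meta-llama/Llama-3.2-1B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)

# 4-bit quantization calibrated on c4, matching the parameters listed above.
quantizer = GPTQQuantizer(bits=4, dataset="c4", model_seqlen=2048)
quantized_model = quantizer.quantize_model(model, tokenizer)

# Persist the quantized weights and quantization config to disk.
quantizer.save(quantized_model, "save_folder")
```
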
## Usage

Load the quantized model with the `load_quantized_model` function from `optimum.gptq`, as sketched below. Make sure to replace `save_folder` with the path to the directory where the quantized model is saved.
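
A minimal loading sketch based on the `optimum.gptq` API; the base-model ID is used to build an empty-weight skeleton, and the prompt is purely illustrative:

```python
# Load the 4-bit GPTQ weights from "save_folder" into an empty-weight
# skeleton of the base model, then run a short generation as a smoke test.
import torch
from accelerate import init_empty_weights
from transformers import AutoModelForCausalLM, AutoTokenizer
from optimum.gptq import load_quantized_model

with init_empty_weights():
    empty_model = AutoModelForCausalLM.from_pretrained(
        "meta-llama/Llama-3.2-1B", torch_dtype=torch.float16
    )
empty_model.tie_weights()

model = load_quantized_model(empty_model, save_folder="save_folder", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B")

# Illustrative prompt; any text-generation call works the same way.
inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
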
## Requirements

- Python 3.8 or higher
- PyTorch 2.0 or higher
- Transformers
- Optimum
- Accelerate
- bitsandbytes
- AutoGPTQ

You can install these dependencies using pip.
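
For example (package names as published on PyPI; pin or adjust versions as needed):

```bash
pip install torch transformers optimum accelerate bitsandbytes auto-gptq
```
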
## Disclaimer

This quantized model is provided for research and experimentation purposes. While quantization can significantly reduce model size and improve inference speed, it may also result in a slight decrease in accuracy compared to the original model.

## Acknowledgements

- Meta AI for releasing the Llama 3.2-1B model.
- The authors of the GPTQ quantization method.
- The Hugging Face team for providing the tools and resources for model sharing and deployment.