# Qwen2-VL-2B-Instruct 4-bit Quantized

This is a 4-bit quantized version of the Qwen2-VL-2B-Instruct model.

## Model Description

- **Original Model**: Qwen/Qwen2-VL-2B-Instruct
- **Quantization**: 4-bit quantization using bitsandbytes
- **Usage**: Quantization substantially reduces the memory needed to load and run the model, with only a small impact on output quality
- **License**: Same as the original model

## Usage
```python
from transformers import Qwen2VLForConditionalGeneration, AutoProcessor

# Use the generation class (not the bare Qwen2VLModel, which has no LM head)
# and the processor, which handles both images and text for this VL model.
model = Qwen2VLForConditionalGeneration.from_pretrained(
    "ksukrit/qwen2-vl-2b-4bit",
    device_map="auto",
)
processor = AutoProcessor.from_pretrained("ksukrit/qwen2-vl-2b-4bit")
```
## Quantization Details

- Quantization method: bitsandbytes 4-bit quantization
- Compute dtype: float16
- Double quantization: enabled
- Quantization type: nf4