Update README.md
Fixed typo from INT4 to INT8 as this is an INT8 model
README.md CHANGED

@@ -12,14 +12,14 @@ base_model: Qwen/Qwen2.5-VL-72B-Instruct
 library_name: transformers
 ---
 
-# Qwen2.5-VL-72B-Instruct-quantized-
+# Qwen2.5-VL-72B-Instruct-quantized-w8a8
 
 ## Model Overview
 - **Model Architecture:** Qwen/Qwen2.5-VL-72B-Instruct
   - **Input:** Vision-Text
   - **Output:** Text
 - **Model Optimizations:**
-  - **Weight quantization:** INT4
+  - **Weight quantization:** INT8
   - **Activation quantization:** FP16
 - **Release Date:** 2/24/2025
 - **Version:** 1.0
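For context (not part of the diff above): a minimal sketch of how a checkpoint with the settings this card describes (INT8 weights, FP16 activations, `library_name: transformers`) would typically be loaded and prompted. The repo id and image URL are placeholders, not values taken from the model card.

```python
# Hypothetical usage sketch; the repo id and image URL below are placeholders.
import requests
from PIL import Image
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration

model_id = "Qwen2.5-VL-72B-Instruct-quantized-w8a8"  # placeholder, substitute the actual Hub path

# The processor handles both the chat template and the image preprocessing.
processor = AutoProcessor.from_pretrained(model_id)
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype="auto",  # INT8 weights; activations stay FP16 per the card
    device_map="auto",   # a 72B model still needs multiple GPUs even with INT8 weights
)

# Vision-text input: one image plus a text question.
image = Image.open(requests.get("https://example.com/sample.jpg", stream=True).raw)
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Describe this image."},
        ],
    }
]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=[prompt], images=[image], return_tensors="pt").to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=128)
# Strip the prompt tokens so only the newly generated text is printed.
generated = output_ids[:, inputs["input_ids"].shape[1]:]
print(processor.batch_decode(generated, skip_special_tokens=True)[0])
```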