---
license: other
license_name: tencent-hunyuan-community
license_link: LICENSE
pipeline_tag: text-to-image
library_name: transformers
language:
- zh
- en
tasks:
- text-to-image-synthesis
frameworks: PyTorch
base_model:
- tencent/HunyuanImage-3.0
base_model_relation: quantized
---
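This model is a qint4-quantized version of [tencent/HunyuanImage-3.0](https://huggingface.co/tencent/HunyuanImage-3.0), quantized with [optimum-quanto](https://github.com/huggingface/optimum-quanto). The weight files are saved using an unofficial method.

The quantized model has so far been tested on a single H20 96GB GPU. Loading relies on unofficial code; see `load_quantized_model.py`, which currently contains two loading methods for reference. Discussion and shared study are welcome. Thank you!

**Loading method 1:** Initialization needs roughly 160 GB of CPU memory, with about 50 GB of GPU memory in use at first. Once inference starts, CPU usage drops to about 70 GB and GPU usage is about 55-60 GB. Warnings about model keys appear during loading, but they do not affect use.

**Loading method 2:** Initialization needs about 75 GB of CPU memory, with about 50 GB of GPU memory in use at first. Once inference starts, CPU usage stays at 75 GB and GPU usage is about 55-60 GB. Because a key map is provided, no warnings appear during loading.

Inference time is roughly the same for both methods: about 12 minutes per image (9:16 / 16:9) on an H20.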
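To make "qint4" more concrete, here is a minimal pure-Python sketch of 4-bit weight quantization: each weight is mapped to a signed integer in [-8, 7] with a shared scale, and two codes are packed into one byte. This is an illustration only. The symmetric scheme, the per-group scale, the nibble order, and the function names `quantize_int4`, `dequantize_int4`, and `pack_int4` are all assumptions made for this example. They are not taken from optimum-quanto or from the weight files in this repository.

```python
from typing import List, Tuple


def quantize_int4(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric 4-bit quantization of one group of weights (illustrative).

    Returns integer codes in [-8, 7] and the scale needed to dequantize.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Map the largest magnitude to 7, the largest positive signed 4-bit value.
    scale = max_abs / 7 if max_abs > 0 else 1.0
    codes = [max(-8, min(7, round(w / scale))) for w in weights]
    return codes, scale


def dequantize_int4(codes: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from 4-bit codes."""
    return [c * scale for c in codes]


def pack_int4(codes: List[int]) -> bytes:
    """Pack two 4-bit codes into each byte, low nibble first (assumed layout)."""
    nibbles = [c & 0x0F for c in codes]  # two's-complement nibble
    if len(nibbles) % 2:
        nibbles.append(0)  # pad to an even number of nibbles
    return bytes(lo | (hi << 4) for lo, hi in zip(nibbles[0::2], nibbles[1::2]))


group = [0.70, -0.33, 0.12, -0.70]
codes, scale = quantize_int4(group)
restored = dequantize_int4(codes, scale)
packed = pack_int4(codes)

print(codes)        # 4-bit integer codes
print(len(packed))  # bytes used to store the four weights
```

Four weights end up in two bytes plus one shared scale, so weight storage is roughly a quarter of the 16-bit size. The rounding error per weight is bounded by half the scale.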
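### HunyuanImage-3.0 is an outstanding multimodal mixture-of-experts model! The introduction below is quoted from the official model page. The model and code provided in this project are for community sharing and technical research and study only; please comply with the official Tencent Hunyuan License.

---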
# 🎨 HunyuanImage-3.0: A Powerful Native Multimodal Model for Image Generation
👏 Join our WeChat and Discord | 💻 Try our model on the official website!

## 🔥🔥🔥 News

- **September 28, 2025**: 📖 **HunyuanImage-3.0 Technical Report Released** - Comprehensive technical documentation now available
- **September 28, 2025**: 🚀 **HunyuanImage-3.0 Open Source Release** - Inference code and model weights publicly available

## 🧩 Community Contributions

If you develop with or use HunyuanImage-3.0 in your projects, please let us know.

## 📑 Open-source Plan

- HunyuanImage-3.0 (Image Generation Model)
  - [x] Inference
  - [x] HunyuanImage-3.0 Checkpoints
  - [ ] HunyuanImage-3.0-Instruct Checkpoints (with reasoning)
  - [ ] VLLM Support
  - [ ] Distilled Checkpoints
  - [ ] Image-to-Image Generation
  - [ ] Multi-turn Interaction

## 🗂️ Contents

- [🔥🔥🔥 News](#-news)
- [🧩 Community Contributions](#-community-contributions)
- [📑 Open-source Plan](#-open-source-plan)
- [📖 Introduction](#-introduction)
- [✨ Key Features](#-key-features)
- [🛠️ Dependencies and Installation](#-dependencies-and-installation)
  - [💻 System Requirements](#-system-requirements)
  - [📦 Environment Setup](#-environment-setup)
  - [📥 Install Dependencies](#-install-dependencies)
  - [Performance Optimizations](#performance-optimizations)
- [🚀 Usage](#-usage)
  - [🔥 Quick Start with Transformers](#-quick-start-with-transformers)
  - [🏠 Local Installation & Usage](#-local-installation--usage)
  - [🎨 Interactive Gradio Demo](#-interactive-gradio-demo)
- [🧱 Models Cards](#-models-cards)
- [📝 Prompt Guide](#-prompt-guide)
  - [Manually Writing Prompts](#manually-writing-prompts)
  - [System Prompt For Automatic Rewriting the Prompt](#system-prompt-for-automatic-rewriting-the-prompt)
  - [Advanced Tips](#advanced-tips)
  - [More Cases](#more-cases)
- [📊 Evaluation](#-evaluation)
- [📚 Citation](#-citation)
- [🙏 Acknowledgements](#-acknowledgements)
- [🌟🚀 Github Star History](#-github-star-history)

---

## 📖 Introduction

**HunyuanImage-3.0** is a groundbreaking native multimodal model that unifies multimodal understanding and generation within an autoregressive framework.
Our text-to-image module achieves performance **comparable to or surpassing** leading closed-source models.