# Genshin Impact Zhongli & Ningguang HunyuanVideo LoRA

This repository contains the setup and scripts needed to generate videos with the HunyuanVideo model, using a LoRA (Low-Rank Adaptation) fine-tuned for Genshin Impact's characters Zhongli and Ningguang. Below are the instructions to install dependencies, download the models, and run the demos.

---

## Installation

### Step 1: Install System Dependencies
Run the following command to install the required system packages:
```bash
sudo apt-get update && sudo apt-get install git-lfs ffmpeg cbm
```
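
Here, `git-lfs` is needed to pull the large model files cloned in the next step, `ffmpeg` handles video encoding, and `cbm` is a console bandwidth monitor that is handy (but optional) for watching the downloads. A quick sanity check that the first two are ready:

```bash
git lfs install            # registers the Git LFS filters used by the clone below
ffmpeg -version | head -n 1
```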

### Step 2: Clone the Repository
Clone the repository and navigate to the project directory:
```bash
git clone https://huggingface.co/svjack/Genshin_Impact_ZhongLi_NingGuang_Couple_HunyuanVideo_lora
cd Genshin_Impact_ZhongLi_NingGuang_Couple_HunyuanVideo_lora
```
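
The LoRA weights used by the demo commands (`zhongli_ningguang_couple_im_lora_dir/zhongli_ningguang_couple_im_lora-000012.safetensors`) are stored in Git LFS. If the clone left only small pointer files behind (for example because `git lfs install` was skipped), fetch the real files:

```bash
git lfs pull
```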

### Step 3: Install Python Dependencies
Install the required Python packages:
```bash
conda create -n py310 python=3.10
conda activate py310
pip install ipykernel
python -m ipykernel install --user --name py310 --display-name "py310"

pip install -r requirements.txt
pip install ascii-magic matplotlib tensorboard huggingface_hub
pip install moviepy==1.0.3
pip install sageattention==1.0.6
```
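
Before downloading tens of gigabytes of weights, it is worth confirming that PyTorch (assumed to be pulled in via `requirements.txt`) can see your GPU:

```bash
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```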

---

## Download Models

### Step 1: Download HunyuanVideo Model
Download the HunyuanVideo model and place it in the `ckpts` directory:
```bash
huggingface-cli download tencent/HunyuanVideo --local-dir ./ckpts
```
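
The demo commands below reference two files from this download. A quick check that they landed where the `--dit` and `--vae` flags expect them:

```bash
ls -lh ckpts/hunyuan-video-t2v-720p/transformers/mp_rank_00_model_states.pt \
       ckpts/hunyuan-video-t2v-720p/vae/pytorch_model.pt
```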

### Step 2: Download LLaVA Model
Download the LLaVA model and preprocess it into the text encoder used by the generation script:
```bash
cd ckpts
huggingface-cli download xtuner/llava-llama-3-8b-v1_1-transformers --local-dir ./llava-llama-3-8b-v1_1-transformers
wget https://raw.githubusercontent.com/Tencent/HunyuanVideo/refs/heads/main/hyvideo/utils/preprocess_text_encoder_tokenizer_utils.py
python preprocess_text_encoder_tokenizer_utils.py --input_dir llava-llama-3-8b-v1_1-transformers --output_dir text_encoder
```

### Step 3: Download CLIP Model
Still inside `ckpts`, download the CLIP model used as the second text encoder:
```bash
huggingface-cli download openai/clip-vit-large-patch14 --local-dir ./text_encoder_2
```
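
Steps 2 and 3 run from inside `ckpts`, while the demo commands below use paths relative to the repository root, so step back out before continuing:

```bash
cd ..
```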

---

## Demo

### Generate Video 1: Zhongli & Ningguang Cooking Rice
Run the following command from the repository root to generate a video of Zhongli and Ningguang cooking rice:
```bash
python hv_generate_video.py \
    --fp8 \
    --video_size 544 960 \
    --video_length 60 \
    --infer_steps 30 \
    --prompt "ZHONGLI\\(genshin impact\\) with NING GUANG\\(genshin impact\\) in red cheongsam. cook rice in a pot" \
    --save_path . \
    --output_type both \
    --dit ckpts/hunyuan-video-t2v-720p/transformers/mp_rank_00_model_states.pt \
    --attn_mode sdpa \
    --vae ckpts/hunyuan-video-t2v-720p/vae/pytorch_model.pt \
    --vae_chunk_size 32 \
    --vae_spatial_tile_sample_min_size 128 \
    --text_encoder1 ckpts/text_encoder \
    --text_encoder2 ckpts/text_encoder_2 \
    --seed 1234 \
    --lora_multiplier 1.0 \
    --lora_weight zhongli_ningguang_couple_im_lora_dir/zhongli_ningguang_couple_im_lora-000012.safetensors
```

<video controls autoplay src="https://cdn-uploads.huggingface.co/production/uploads/634dffc49b777beec3bc6448/OHfEzY_Ernp778vMC-K8D.mp4"></video>
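
The demos use `--output_type both`, which should write the latents alongside the finished video. For a lightweight, shareable preview of the video itself, the `ffmpeg` installed in Step 1 can convert it to a GIF; the input filename below is a placeholder, so substitute whatever file the script wrote to `--save_path`:

```bash
# output.mp4 is a placeholder; use the actual file written to --save_path
ffmpeg -i output.mp4 -vf "fps=10,scale=480:-1" preview.gif
```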

### Generate Video 2: Zhongli Drinking Tea
Run the following command to generate a video of Zhongli drinking tea:
```bash
python hv_generate_video.py \
    --fp8 \
    --video_size 544 960 \
    --video_length 60 \
    --infer_steps 30 \
    --prompt "ZHONGLI\\(genshin impact\\). drink tea" \
    --save_path . \
    --output_type both \
    --dit ckpts/hunyuan-video-t2v-720p/transformers/mp_rank_00_model_states.pt \
    --attn_mode sdpa \
    --vae ckpts/hunyuan-video-t2v-720p/vae/pytorch_model.pt \
    --vae_chunk_size 32 \
    --vae_spatial_tile_sample_min_size 128 \
    --text_encoder1 ckpts/text_encoder \
    --text_encoder2 ckpts/text_encoder_2 \
    --seed 1234 \
    --lora_multiplier 1.0 \
    --lora_weight zhongli_ningguang_couple_im_lora_dir/zhongli_ningguang_couple_im_lora-000012.safetensors
```

---

## Notes
- Ensure you have sufficient GPU resources for video generation.
- Adjust the `--video_size`, `--video_length`, and `--infer_steps` parameters to trade off output quality, length, and runtime; see the example after this list.
- Modify the `--prompt` parameter to generate videos with different scenes or actions.
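
As a concrete example of the second note, a quicker, lower-resolution run for iterating on prompts might look like this. Only the size, length, and step values differ from the demos above, and the numbers are illustrative rather than tuned:

```bash
python hv_generate_video.py \
    --fp8 \
    --video_size 320 576 \
    --video_length 30 \
    --infer_steps 20 \
    --prompt "ZHONGLI\\(genshin impact\\). drink tea" \
    --save_path . \
    --output_type both \
    --dit ckpts/hunyuan-video-t2v-720p/transformers/mp_rank_00_model_states.pt \
    --attn_mode sdpa \
    --vae ckpts/hunyuan-video-t2v-720p/vae/pytorch_model.pt \
    --vae_chunk_size 32 \
    --vae_spatial_tile_sample_min_size 128 \
    --text_encoder1 ckpts/text_encoder \
    --text_encoder2 ckpts/text_encoder_2 \
    --seed 1234 \
    --lora_multiplier 1.0 \
    --lora_weight zhongli_ningguang_couple_im_lora_dir/zhongli_ningguang_couple_im_lora-000012.safetensors
```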

---