
VisorGPT: Learning Visual Prior via Generative Pre-Training [Arxiv] [Demo] [Video]

Updates

Quick Start

Step 1

# clone the repo
git clone https://github.com/Sierkinhane/VisorGPT.git

# go to directory
cd VisorGPT

# create a new environment
conda create -n visorgpt python=3.8

# activate the new environment
conda activate visorgpt

# prepare the basic environments
pip3 install -r requirements.txt

# install ControlNet and GLIGEN
cd demo/ControlNet
pip3 install -v -e .
cd ../GLIGEN
pip3 install -v -e .

Step 2 - Download pre-trained weights

Download the visorgpt, controlnet-pose2img, controlnet-sd, and gligen-bbox2img weights, and arrange them as follows (a placement sketch is given after the tree):

├── demo/
|   ├── ckpts
|   |   ├── controlnet
|   |   |   ├── control_v11p_sd15_openpose.pth
|   |   |   ├── v1-5-pruned-emaonly.safetensors
|   |   ├── gligen
|   |   |   ├── diffusion_pytorch_model_box.bin
|   |   ├── visorgpt
|   |   |   ├── visorgpt_dagger_ta_tb.pt
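
A minimal shell sketch for placing the weights, assuming the four files have already been downloaded into the repository root (the download sources are the links above and are not repeated here):

# create the expected checkpoint layout
mkdir -p demo/ckpts/controlnet demo/ckpts/gligen demo/ckpts/visorgpt

# move the downloaded files into place
mv control_v11p_sd15_openpose.pth v1-5-pruned-emaonly.safetensors demo/ckpts/controlnet/
mv diffusion_pytorch_model_box.bin demo/ckpts/gligen/
mv visorgpt_dagger_ta_tb.pt demo/ckpts/visorgpt/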

Step 3 - Run demo

CUDA_VISIBLE_DEVICES=0 python3 gradio_demo.py
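
Optionally, before launching, a quick check (a minimal sketch using the standard PyTorch API) confirms that a GPU is visible to the environment:

# optional: verify that PyTorch can see a CUDA device before starting the demo
python3 -c "import torch; print(torch.cuda.is_available(), torch.cuda.device_count())"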

If you use our code, please consider citing our paper.

@article{xie2023visorgpt,
  title={VisorGPT: Learning Visual Prior via Generative Pre-Training},
  author={Xie, Jinheng and Ye, Kai and Li, Yudong and Li, Yuexiang and Lin, Kevin Qinghong and Zheng, Yefeng and Shen, Linlin and Shou, Mike Zheng},
  journal={arXiv preprint arXiv:2305.13777},
  year={2023}
}