<div align="center">
<h1>
STAR: Spatial-Temporal Augmentation with Text-to-Video Models for Real-World Video Super-Resolution
</h1>
<div>
<a href='https://github.com/CSRuiXie' target='_blank'>Rui Xie<sup>1*</sup></a>, 
<a href='https://github.com/yhliu04' target='_blank'>Yinhong Liu<sup>1*</sup></a>, 
<a href='https://scholar.google.com/citations?user=Uhp3JKgAAAAJ&hl=zh-CN&oi=sra' target='_blank'>Chen Zhao<sup>1</sup></a>, 
<a href='https://scholar.google.com/citations?hl=zh-CN&user=yWq1Fd4AAAAJ' target='_blank'>Penghao Zhou<sup>2</sup></a>, 
<a href='https://scholar.google.com/citations?hl=zh-CN&user=Ds5wwRoAAAAJ' target='_blank'>Zhenheng Yang<sup>2</sup></a><br>
<a href='https://scholar.google.com/citations?hl=zh-CN&user=w03CHFwAAAAJ' target='_blank'>Jun Zhou<sup>3</sup></a>, 
<a href='https://cszn.github.io/' target='_blank'>Kai Zhang<sup>1</sup></a>, 
<a href='https://jessezhang92.github.io/' target='_blank'>Zhenyu Zhang<sup>1</sup></a>, 
<a href='https://scholar.google.com.hk/citations?user=6CIDtZQAAAAJ&hl=zh-CN' target='_blank'>Jian Yang<sup>1</sup></a>, 
<a href='https://tyshiwo.github.io/index.html' target='_blank'>Ying Tai<sup>1†</sup></a>
</div>
<div>
<sup>1</sup>Nanjing University, <sup>2</sup>ByteDance,  <sup>3</sup>Southwest University
</div>
<div>
<h4 align="center">
<a href="https://nju-pcalab.github.io/projects/STAR" target='_blank'>
<img src="https://img.shields.io/badge/π-Project%20Page-blue">
</a>
<a href="https://arxiv.org/abs/2501.02976" target='_blank'>
<img src="https://img.shields.io/badge/arXiv-2501.02976-b31b1b.svg">
</a>
<a href="https://youtu.be/hx0zrql-SrU" target='_blank'>
<img src="https://img.shields.io/badge/Demo%20Video-%23FF0000.svg?logo=YouTube&logoColor=white">
</a>
</h4>
</div>
</div>
### Updates
- **2024.12.01** The pretrained STAR model (I2VGen-XL version) and inference code have been released.
## Method Overview
![STAR](assets/overview.png)
## 📷 Results Display
![STAR](assets/teaser.png)
![STAR](assets/real_world.png)
More visual results are available on our [Project Page](https://nju-pcalab.github.io/projects/STAR) and in the [Video Demo](https://youtu.be/hx0zrql-SrU).
## ⚙️ Dependencies and Installation
```
# Clone this repository
git clone https://github.com/NJU-PCALab/STAR.git
cd STAR
# Create and activate a conda environment
conda create -n star python=3.10
conda activate star
pip install -r requirements.txt
# Install system packages (both apt-get commands need root privileges)
sudo apt-get update && sudo apt-get install -y ffmpeg libsm6 libxext6
```
## Inference
#### Step 1: Download the pretrained STAR model from [HuggingFace](https://huggingface.co/SherryX/STAR).
We provide two versions: `heavy_deg.pt` for heavily degraded videos and `light_deg.pt` for lightly degraded videos (e.g., low-resolution videos downloaded from video websites).
Place the weights in `pretrained_weight/`.
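
One possible way to fetch a checkpoint from the command line is sketched below. It assumes the `huggingface-cli` tool from the `huggingface_hub` package is installed and that the `.pt` files sit at the top level of the `SherryX/STAR` repository; neither is stated in this README, so adjust the file path if the repository layout differs. The helper names `star_weight_name` and `download_star_weight` are hypothetical.

```shell
# Hypothetical helper: map a degradation level to the checkpoint name listed above.
star_weight_name() {
  case "$1" in
    heavy) echo "heavy_deg.pt" ;;
    light) echo "light_deg.pt" ;;
    *) echo "unknown degradation level: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: download one checkpoint into pretrained_weight/.
# Assumes huggingface-cli is on PATH and the file is at the repository root.
download_star_weight() {
  weight_name="$(star_weight_name "$1")" || return 1
  mkdir -p pretrained_weight
  huggingface-cli download SherryX/STAR "$weight_name" --local-dir pretrained_weight
}
```

For example, `download_star_weight light` would place `light_deg.pt` under `pretrained_weight/` if the assumptions above hold. Downloading the file manually from the HuggingFace page works just as well.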
#### Step 2: Prepare testing data
Place the test videos in `input/video/`.

There are three options for the prompt:

1. Use no prompt.
2. Generate a prompt automatically [using PLLaVA](https://github.com/hpcaitech/Open-Sora/tree/main/tools/caption#pllava-captioning).
3. Write the prompt manually.

If you use a prompt, place the txt file in `input/text/`.
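
The folder layout for this step can be set up with a few commands. This is only a sketch: the helper `count_videos` is hypothetical, and it assumes the test videos use the `.mp4` extension, which this README does not specify.

```shell
# Create the input folders named above (no-op if they already exist).
mkdir -p input/video input/text

# Hypothetical helper: count the video files directly inside a folder.
# Assumption: inputs use the .mp4 extension; extend the pattern as needed.
count_videos() {
  find "$1" -maxdepth 1 -type f -name '*.mp4' | wc -l | tr -d ' '
}

echo "videos found: $(count_videos input/video)"
```

A count of zero after copying your files usually means the videos landed in a different folder or use another extension.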
#### Step 3: Change the path
Update the paths in `video_super_resolution/scripts/inference_sr.sh` to match your local setup, including `video_folder_path`, `txt_file_path`, `model_path`, and `save_dir`.
#### Step 4: Run the inference command
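
A quick pre-flight check can catch path mistakes before a long inference run. The sketch below is not part of the repository: the function name `check_star_paths` is hypothetical, and it assumes the prompt file is optional (matching the "no prompt" option) and that the output folder may be created on demand.

```shell
# Hypothetical pre-flight check; not part of the STAR repository.
# Arguments: video_folder_path model_path save_dir [txt_file_path]
check_star_paths() {
  rc=0
  [ -d "$1" ] || { echo "missing video folder: $1" >&2; rc=1; }
  [ -f "$2" ] || { echo "missing model weights: $2" >&2; rc=1; }
  # Create the output folder if it does not exist yet.
  mkdir -p "$3" || rc=1
  # The prompt file is only checked when one is given.
  if [ -n "${4:-}" ] && [ ! -f "$4" ]; then
    echo "missing prompt file: $4" >&2
    rc=1
  fi
  return "$rc"
}
```

For instance, `check_star_paths input/video pretrained_weight/light_deg.pt results` returns a non-zero status and prints what is missing if either the video folder or the weight file cannot be found; `results` here is only a placeholder for your own `save_dir`.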
```
bash video_super_resolution/scripts/inference_sr.sh
```
## ❤️ Acknowledgments
This project is based on [I2VGen-XL](https://github.com/ali-vilab/VGen), [VEnhancer](https://github.com/Vchitect/VEnhancer), and [CogVideoX](https://github.com/THUDM/CogVideo). We thank the authors for their excellent work.
## Citations
If our project helps your research or work, please consider citing our paper:
```
@misc{xie2024addsr,
title={AddSR: Accelerating Diffusion-based Blind Super-Resolution with Adversarial Diffusion Distillation},
author={Rui Xie and Ying Tai and Kai Zhang and Zhenyu Zhang and Jun Zhou and Jian Yang},
year={2024},
eprint={2404.01717},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
## 📧 Contact
If you have any questions, please reach out by email at `ruixie0097@gmail.com`.