
	<p align="center">

	<h2 align="center">X-Portrait: Expressive Portrait Animation with Hierarchical Motion Attention</h2>
	<p align="center">
	<a href="https://scholar.google.com/citations?user=FV0eXhQAAAAJ&hl=en">You Xie</a>,
	<a href="https://hongyixu37.github.io/homepage/">Hongyi Xu</a>,
	<a href="https://guoxiansong.github.io/homepage/index.html">Guoxian Song</a>,
	<a href="https://chaowang.info/">Chao Wang</a>,
	<a href="https://seasonsh.github.io/">Yichun Shi</a>,
	<a href="http://linjieluo.com/">Linjie Luo</a>
	<br>
	<b>  ByteDance Inc. </b>
	<br>
	<br>
	<a href="https://arxiv.org/abs/2403.15931"><img src='https://img.shields.io/badge/arXiv-X--Portrait-red' alt='Paper PDF'></a>
	<a href='https://byteaigc.github.io/x-portrait/'><img src='https://img.shields.io/badge/Project_Page-X--Portrait-green' alt='Project Page'></a>
	<a href='https://youtu.be/VGxt5XghRdw'>
	<img src='https://img.shields.io/badge/YouTube-X--Portrait-rgb(255, 0, 0)' alt='YouTube'></a>
	<br>
	</p>

	<table align="center">
	<tr>
	<td>
	<img src="assets/teaser/teaser.png">
	</td>
	</tr>
	</table>

	This repository contains the video generation code for the SIGGRAPH 2024 paper [X-Portrait](https://arxiv.org/pdf/2403.15931).

	## Installation
	Note: Python 3.9 and CUDA 11.8 are required.
	```shell
	bash env_install.sh
	```
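
Before running the install script, it can help to confirm which Python and CUDA toolkit versions are visible in the current shell. The following is a minimal sketch, assuming `python3` and `nvcc` are the relevant executables on `PATH`; `env_install.sh` may set up its own environment, in which case this check only describes the system outside it.

```shell
# Report the Python version on PATH (the README asks for 3.9).
py_version="$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])' 2>/dev/null || echo "not found")"
echo "Python: ${py_version}"
if [ "${py_version}" != "3.9" ]; then
  echo "warning: expected Python 3.9"
fi

# Report the CUDA toolkit version as printed by nvcc (the README asks for 11.8).
if command -v nvcc >/dev/null 2>&1; then
  cuda_version="$(nvcc --version | sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p')"
  cuda_version="${cuda_version:-unknown}"
else
  cuda_version="not found"
fi
echo "CUDA: ${cuda_version}"
if [ "${cuda_version}" != "11.8" ]; then
  echo "warning: expected CUDA 11.8"
fi
```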

	## Model
	Please download the pre-trained model from [here](https://drive.google.com/drive/folders/1Bq0n-w1VT5l99CoaVg02hFpqE5eGLo9O?usp=sharing) and save it under `checkpoint/`.
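
A small sketch for preparing and checking the checkpoint folder, run from the repository root. It only creates `checkpoint/` and counts the files in it; the checkpoint file name is not given here, so none is assumed.

```shell
# Create the folder the README expects and count what is inside it.
ckpt_dir="checkpoint"
mkdir -p "${ckpt_dir}"
n_files="$(find "${ckpt_dir}" -type f | wc -l | tr -d ' ')"

if [ "${n_files}" -eq 0 ]; then
  echo "No files in ${ckpt_dir}/ yet: download the pre-trained model and place it there."
else
  echo "Found ${n_files} file(s) in ${ckpt_dir}/."
fi
```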

	## Testing
	```shell
	bash scripts/test_xportrait.sh
	```
	Parameters:
	- `model_config`: config file of the corresponding model
	- `output_dir`: output path for the generated video
	- `source_image`: path to the source image
	- `driving_video`: path to the driving video
	- `best_frame`: index of the frame in the driving video whose head pose best matches the source image (note: the accuracy of this index may affect the final quality)
	- `out_frames`: number of frames to generate
	- `num_mix`: number of overlapping frames when applying prompt travelling during inference
	- `ddim_steps`: number of inference steps (e.g., 30 steps for DDIM)
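
To make the parameters above concrete, here is a dry-run sketch that assembles an argument string and prints it instead of launching inference. Every path and numeric value is a placeholder except the 30 DDIM steps mentioned above, and the `--name value` spelling is an assumption based on how `--initial_facevid2vid_results` is written elsewhere in this README; check `scripts/test_xportrait.sh` for the actual invocation.

```shell
# Placeholder values: replace with your own paths and settings.
model_config="path/to/model_config.yaml"
output_dir="outputs"
source_image="path/to/source_image.png"
driving_video="path/to/driving_video.mp4"
best_frame=0      # frame index whose head pose best matches the source image
out_frames=16     # number of frames to generate
num_mix=4         # overlapping frames for prompt travelling
ddim_steps=30     # inference steps

# Assumed flag spelling ("--name value"); printed only, nothing is executed.
cmd_args="--model_config ${model_config} --output_dir ${output_dir}"
cmd_args="${cmd_args} --source_image ${source_image} --driving_video ${driving_video}"
cmd_args="${cmd_args} --best_frame ${best_frame} --out_frames ${out_frames}"
cmd_args="${cmd_args} --num_mix ${num_mix} --ddim_steps ${ddim_steps}"
echo "${cmd_args}"
```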

	## Performance Boost
	- Efficiency: our model is compatible with [LCM LoRA](https://huggingface.co/latent-consistency/lcm-lora-sdv1-5), which helps reduce the number of inference steps.
	- Expressiveness: results can be made more expressive by supplying the output of another face reenactment approach, e.g., face vid2vid, via the `--initial_facevid2vid_results` parameter.
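
As a sketch of the expressiveness option, the snippet below adds `--initial_facevid2vid_results` to an argument string only when a result path is set. The path is a placeholder, and the format the flag expects (video file, frame folder, or something else) is not specified here, so treat the value as an assumption.

```shell
# Placeholder path to precomputed face vid2vid output; leave empty to skip.
facevid2vid_results="path/to/facevid2vid_results"

extra_args=""
if [ -n "${facevid2vid_results}" ]; then
  extra_args="--initial_facevid2vid_results ${facevid2vid_results}"
fi

# Printed only; append this to the inference arguments when it is non-empty.
echo "${extra_args}"
```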

	## 🎓 Citation
	If you find this codebase useful for your research, please cite it using the following entry.
	```BibTeX
	@article{xie2024x,
	title={X-Portrait: Expressive Portrait Animation with Hierarchical Motion Attention},
	author={Xie, You and Xu, Hongyi and Song, Guoxian and Wang, Chao and Shi, Yichun and Luo, Linjie},
	journal={arXiv preprint arXiv:2403.15931},
	year={2024}
	}
	```