---
license: apache-2.0
---

# VFIMamba: Video Frame Interpolation with State Space Models

This is the official checkpoint library for [VFIMamba: Video Frame Interpolation with State Space Models](https://arxiv.org/abs/2407.02315).

Please refer to [this repository](https://github.com/MCG-NJU/VFIMamba) for our code.

## Model Description

VFIMamba is the first approach to adapt state space models (SSMs) to the video frame interpolation (VFI) task.

1. We devise the Mixed-SSM Block (MSB) for efficient inter-frame modeling using S6.
2. We explore various rearrangement methods to convert two frames into a sequence, discovering that interleaved rearrangement is more suitable for VFI tasks.
3. We propose a curriculum learning strategy to further leverage the potential of the S6 model.

Experimental results show that VFIMamba achieves state-of-the-art performance across various datasets and, in particular, highlight the potential of SSMs for high-resolution VFI.

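The rearrangement idea in the second point can be illustrated with a toy sketch. It assumes that "interleaved" means alternating tokens from the two frames so that tokens at the same spatial position end up adjacent in the sequence, as opposed to concatenating one frame after the other. The function names are hypothetical and are not taken from the VFIMamba code base, which operates on feature tensors rather than string labels.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def sequential_rearrange(frame0: Sequence[T], frame1: Sequence[T]) -> List[T]:
    """Concatenate all tokens of frame 0, followed by all tokens of frame 1."""
    return list(frame0) + list(frame1)


def interleaved_rearrange(frame0: Sequence[T], frame1: Sequence[T]) -> List[T]:
    """Alternate tokens so the same position in both frames is adjacent."""
    if len(frame0) != len(frame1):
        raise ValueError("both frames must contribute the same number of tokens")
    merged: List[T] = []
    for tok0, tok1 in zip(frame0, frame1):
        merged.extend((tok0, tok1))
    return merged


# Toy 2x2 frames flattened in raster order; labels stand in for feature vectors.
frame_a = ["a00", "a01", "a10", "a11"]
frame_b = ["b00", "b01", "b10", "b11"]

print(sequential_rearrange(frame_a, frame_b))
print(interleaved_rearrange(frame_a, frame_b))
```

In the interleaved sequence, a token is always next to its counterpart from the other frame, whereas in the sequential layout the two are separated by a full frame's worth of tokens.
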
## Usage

We provide two models: an efficient version (VFIMamba-S) and a stronger one (VFIMamba). Select one with the `--model` argument, which takes `VFIMamba_S` or `VFIMamba`.

### Manually Load

Please refer to [the instructions here](https://github.com/MCG-NJU/VFIMamba/tree/main?tab=readme-ov-file#sunglassesplay-with-demos) for loading the checkpoints manually and for a more customized experience.

```bash
# --model accepts VFIMamba_S or VFIMamba
python demo_2x.py --model VFIMamba          # for 2x interpolation
python demo_Nx.py --n 8 --model VFIMamba    # for 8x interpolation
```

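As a rough guide to what `--n 8` implies, the sketch below lists the intermediate time positions between two input frames. It assumes that Nx interpolation means N-1 evenly spaced intermediate frames; how `demo_Nx.py` actually schedules them (for example directly or recursively) is not specified here, and `intermediate_timesteps` is a hypothetical helper, not part of the repository.

```python
from fractions import Fraction
from typing import List


def intermediate_timesteps(n: int) -> List[Fraction]:
    """Return the n-1 evenly spaced time positions strictly between 0 and 1."""
    if n < 2:
        raise ValueError("n must be at least 2")
    return [Fraction(i, n) for i in range(1, n)]


# 8x interpolation: seven new frames between each pair of input frames.
print([float(t) for t in intermediate_timesteps(8)])
# -> [0.125, 0.25, 0.375, 0.5, 0.625, 0.75, 0.875]
```
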
### Hugging Face Demo

For the Hugging Face demo, please refer to [the code here](https://github.com/MCG-NJU/VFIMamba/blob/main/hf_demo_2x.py).

```bash
# --model accepts VFIMamba_S or VFIMamba
python hf_demo_2x.py --model VFIMamba    # for 2x interpolation
```

## Citation

If you find this project helpful in your research or applications, please feel free to leave a star ⭐️ and cite our paper:

```
@misc{zhang2024vfimambavideoframeinterpolation,
      title={VFIMamba: Video Frame Interpolation with State Space Models},
      author={Guozhen Zhang and Chunxu Liu and Yutao Cui and Xiaotong Zhao and Kai Ma and Limin Wang},
      year={2024},
      eprint={2407.02315},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2407.02315},
}
```