<div align="center">
# \[CVPR'24\] Code release for OmniGlue (ONNX)
[![Open in Spaces](https://huggingface.co/datasets/huggingface/badges/resolve/main/open-in-hf-spaces-sm.svg)](https://huggingface.co/spaces/Realcat/image-matching-webui)
<p align="center">
<a href="https://hwjiang1510.github.io/">Hanwen Jiang</a>,
<a href="https://scholar.google.com/citations?user=jgSItF4AAAAJ">Arjun Karpur</a>,
<a href="https://scholar.google.com/citations?user=7EeSOcgAAAAJ">Bingyi Cao</a>,
<a href="https://www.cs.utexas.edu/~huangqx/">Qixing Huang</a>,
<a href="https://andrefaraujo.github.io/">Andre Araujo</a>
</p>
</div>
--------------------------------------------------------------------------------
<div align="center">
<a href="https://hwjiang1510.github.io/OmniGlue/"><strong>Project Page</strong></a> |
<a href="https://arxiv.org/abs/2405.12979"><strong>Paper</strong></a> |
<a href="#installation"><strong>Usage</strong></a> |
<a href="https://huggingface.co/spaces/qubvel-hf/omniglue"><strong>Demo</strong></a>
</div>
<br>
ONNX-compatible release for the CVPR 2024 paper: **OmniGlue: Generalizable Feature
Matching with Foundation Model Guidance**.
![og_diagram.png](res/og_diagram.png "og_diagram.png")
**Abstract:** The image matching field has been witnessing a continuous
emergence of novel learnable feature matching techniques, with ever-improving
performance on conventional benchmarks. However, our investigation shows that
despite these gains, their potential for real-world applications is restricted
by their limited generalization capabilities to novel image domains. In this
paper, we introduce OmniGlue, the first learnable image matcher that is designed
with generalization as a core principle. OmniGlue leverages broad knowledge from
a vision foundation model to guide the feature matching process, boosting
generalization to domains not seen at training time. Additionally, we propose a
novel keypoint position-guided attention mechanism which disentangles spatial
and appearance information, leading to enhanced matching descriptors. We perform
comprehensive experiments on a suite of 6 datasets with varied image domains,
including scene-level, object-centric and aerial images. OmniGlue’s novel
components lead to relative gains on unseen domains of 18.8% with respect to a
directly comparable reference model, while also outperforming the recent
LightGlue method by 10.1% relatively.
## Installation
First, create a conda environment and use pip to install `omniglue`:
```sh
conda create -n omniglue pip
conda activate omniglue
git clone https://github.com/google-research/omniglue.git
cd omniglue
pip install -e .
```
Then, download the following models to `./models/`:
```sh
# Download to ./models/ dir.
mkdir models
cd models
# SuperPoint.
git clone https://github.com/rpautrat/SuperPoint.git
mv SuperPoint/pretrained_models/sp_v6.tgz . && rm -rf SuperPoint
tar zxvf sp_v6.tgz && rm sp_v6.tgz
# DINOv2 - vit-b14.
wget https://dl.fbaipublicfiles.com/dinov2/dinov2_vitb14/dinov2_vitb14_pretrain.pth
# OmniGlue.
wget https://storage.googleapis.com/omniglue/og_export.zip
unzip og_export.zip && rm og_export.zip
```
Direct download links:
- [[SuperPoint weights]](https://github.com/rpautrat/SuperPoint/tree/master/pretrained_models): from [github.com/rpautrat/SuperPoint](https://github.com/rpautrat/SuperPoint)
- [[DINOv2 weights]](https://dl.fbaipublicfiles.com/dinov2/dinov2_vitb14/dinov2_vitb14_pretrain.pth): from [github.com/facebookresearch/dinov2](https://github.com/facebookresearch/dinov2) (ViT-B/14 distilled backbone without registers).
- [[OmniGlue weights]](https://storage.googleapis.com/omniglue/og_export.zip)
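Before running inference, it can help to confirm that the model files are where the code expects them. The following is a minimal sketch, not part of the repository: the helper `missing_model_files` is hypothetical, and the file names are assumed from the usage snippet in this README, so adjust them if your files unpack under different names.

```python
from pathlib import Path

# File names assumed from this README's usage snippet; adjust to your layout.
EXPECTED_MODEL_FILES = (
    "omniglue.onnx",
    "sp_v6.onnx",
    "dinov2_vitb14_pretrain.pth",
)


def missing_model_files(models_dir="./models", expected=EXPECTED_MODEL_FILES):
    """Return the expected file names that are not present in models_dir."""
    root = Path(models_dir)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_model_files()
    if missing:
        print("Missing model files:", ", ".join(missing))
    else:
        print("All expected model files found.")
```

An empty list means every expected file was found in `./models/`.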
## Usage
The code snippet below outlines how you can perform OmniGlue inference in your
own Python codebase.
```py
import numpy as np
from PIL import Image

from src import omniglue

# Load both images as RGB np.array (Pillow is one option; any loader works).
image0 = np.array(Image.open("./res/demo1.jpg").convert("RGB"))
image1 = np.array(Image.open("./res/demo2.jpg").convert("RGB"))

og = omniglue.OmniGlue(
    og_export="./models/omniglue.onnx",
    sp_export="./models/sp_v6.onnx",
    dino_export="./models/dinov2_vitb14_pretrain.pth",
)

match_kp0s, match_kp1s, match_confidences = og.FindMatches(image0, image1)
# Output:
#   match_kp0s: (N, 2) array of (x,y) coordinates in image0.
#   match_kp1s: (N, 2) array of (x,y) coordinates in image1.
#   match_confidences: N-dim array of each of the N match confidence scores.
```
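Since `FindMatches` returns a confidence score per match, a common follow-up step is to keep only the matches above some threshold. Below is a minimal sketch of that step using NumPy only. The helper `filter_matches` and the default threshold are illustrative assumptions, not part of the `omniglue` API; choose a threshold that suits your data.

```python
import numpy as np


def filter_matches(kp0, kp1, confidences, threshold=0.02):
    """Keep only matches whose confidence exceeds threshold.

    kp0, kp1: (N, 2) arrays of (x, y) coordinates.
    confidences: length-N array of match confidence scores.
    threshold: illustrative placeholder value, not an official default.
    """
    kp0 = np.asarray(kp0)
    kp1 = np.asarray(kp1)
    confidences = np.asarray(confidences)
    keep = confidences > threshold  # boolean mask over the N matches
    return kp0[keep], kp1[keep], confidences[keep]
```

With the outputs of the snippet above, this would be called as `filter_matches(match_kp0s, match_kp1s, match_confidences)`.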
## Demo
`demo.py` contains example usage of the `omniglue` module. To try it with your own
images, replace `./res/demo1.jpg` and `./res/demo2.jpg` with your own
file paths.
```sh
conda activate omniglue
python demo.py ./res/demo1.jpg ./res/demo2.jpg
# <see output in './demo_output.png'>
```
Expected output:
![demo_output.png](res/demo_output.png "demo_output.png")
Comparison of Results Between TensorFlow and ONNX:
![result_tf_and_onnx.png](res/result_tf_and_onnx.png "result_tf_and_onnx.png")
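To build a match visualization yourself, one option is to place both images on a shared canvas and shift the second image's keypoints by the width of the first. The sketch below shows only that geometry, using NumPy. The helpers `side_by_side` and `match_segments` are hypothetical names, and `demo.py` may compose its output differently.

```python
import numpy as np


def side_by_side(image0, image1):
    """Place two HxWx3 uint8 images on one canvas.

    Returns the canvas and the x offset at which image1 starts.
    """
    height = max(image0.shape[0], image1.shape[0])
    w0, w1 = image0.shape[1], image1.shape[1]
    canvas = np.zeros((height, w0 + w1, 3), dtype=np.uint8)
    canvas[: image0.shape[0], :w0] = image0
    canvas[: image1.shape[0], w0:] = image1
    return canvas, w0


def match_segments(kp0, kp1, x_offset):
    """Return an (N, 2, 2) array of line endpoints in canvas coordinates.

    segments[i, 0] is the (x, y) point in image0 and segments[i, 1] is the
    matching point in image1, shifted right by x_offset.
    """
    kp0 = np.asarray(kp0, dtype=float)
    kp1 = np.asarray(kp1, dtype=float) + np.array([x_offset, 0.0])
    return np.stack([kp0, kp1], axis=1)
```

Each segment can then be drawn over the canvas with any plotting library.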
## Repo TODOs
- ~~Provide `demo.py` example usage script.~~
- Support matching for pre-extracted features.
- Release eval pipelines for in-domain (MegaDepth).
- Release eval pipelines for all out-of-domain datasets.
## BibTeX
```
@inproceedings{jiang2024Omniglue,
  title={OmniGlue: Generalizable Feature Matching with Foundation Model Guidance},
  author={Jiang, Hanwen and Karpur, Arjun and Cao, Bingyi and Huang, Qixing and Araujo, Andre},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2024},
}
```
--------------------------------------------------------------------------------
This is not an officially supported Google product.