XoFTR: Cross-modal Feature Matching Transformer
Paper (arXiv) | Paper (CVF)
This is the PyTorch implementation of XoFTR: Cross-modal Feature Matching Transformer, a CVPR 2024 Image Matching Workshop paper.
XoFTR is a cross-modal cross-view method for local feature matching between thermal infrared (TIR) and visible images.
Colab demo
To run XoFTR with custom image pairs without configuring your own GPU environment, you can use the Colab demo.
Installation
conda env create -f environment.yaml
conda activate xoftr
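After installation, a quick optional sanity check (not part of the repo) confirms that PyTorch and CUDA are usable inside the environment:
# verify PyTorch and GPU availability in the xoftr environment
python -c "import torch; print(torch.__version__, 'CUDA available:', torch.cuda.is_available())"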
Download links:
- Pretrained model weights: two versions are available, trained at 640 and 840 resolution.
- METU-VisTIR dataset
METU-VisTIR Dataset
This dataset includes thermal and visible images captured across six diverse scenes with ground-truth camera poses. Four of the scenes contain images captured under both cloudy and sunny conditions, while the remaining two feature cloudy conditions only. Since the cameras are auto-focus, there may be slight imperfections in the ground-truth camera parameters. For more information about the dataset, please refer to our paper.
License of the dataset:
The METU-VisTIR dataset is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0).
Data format
The dataset is organized into folders according to scenarios. The organization format is as follows:
METU-VisTIR/
├── index/
│   ├── scene_info_test/
│   │   ├── cloudy_cloudy_scene_1.npz  # scene info with test pairs
│   │   └── ...
│   ├── scene_info_val/
│   │   ├── cloudy_cloudy_scene_1.npz  # scene info with val pairs
│   │   └── ...
│   └── val_test_list/
│       ├── test_list.txt              # test scenes list
│       └── val_list.txt               # val scenes list
├── cloudy/                            # cloudy scenes
│   ├── scene_1/
│   │   ├── thermal/
│   │   │   └── images/                # thermal images
│   │   └── visible/
│   │       └── images/                # visible images
│   └── ...
└── sunny/                             # sunny scenes
    └── ...
The cloudy_cloudy_scene_*.npz and cloudy_sunny_scene_*.npz files contain the ground-truth camera poses and image pairs.
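A minimal sketch for inspecting one of these files (the array names inside the .npz files are not documented here, so the code prints the keys rather than assuming a schema):
# list the arrays stored in a scene info file
import numpy as np

scene = np.load("METU-VisTIR/index/scene_info_test/cloudy_cloudy_scene_1.npz",
                allow_pickle=True)
for key in scene.files:
    print(key, getattr(scene[key], "shape", None))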
Running XoFTR
Demo to match image pairs with XoFTR
A demo notebook that matches a single image pair with XoFTR is provided in notebooks/xoftr_demo.ipynb.
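If you prefer a script over the notebook, the matching step looks roughly like the sketch below. It assumes a LoFTR-style interface (XoFTR is derived from LoFTR), so the import path, config object, and output keys mkpts0_f/mkpts1_f are assumptions; the notebook is the authoritative reference.
# Hypothetical single-pair matching sketch modeled on the LoFTR API;
# see notebooks/xoftr_demo.ipynb for the exact names used by this repo.
import cv2
import torch
from src.xoftr import XoFTR, default_cfg  # assumed names, LoFTR convention

matcher = XoFTR(config=default_cfg)
state = torch.load("weights/weights_xoftr_640.ckpt", map_location="cpu")
matcher.load_state_dict(state["state_dict"])
matcher = matcher.eval().cuda()

# LoFTR-style matchers take grayscale tensors in [0, 1], shape (1, 1, H, W).
img0 = cv2.imread("thermal.png", cv2.IMREAD_GRAYSCALE)
img1 = cv2.imread("visible.png", cv2.IMREAD_GRAYSCALE)
batch = {
    "image0": torch.from_numpy(img0)[None, None].float().cuda() / 255.0,
    "image1": torch.from_numpy(img1)[None, None].float().cuda() / 255.0,
}

with torch.no_grad():
    matcher(batch)  # results are written back into the batch dict
mkpts0 = batch["mkpts0_f"].cpu().numpy()  # matched keypoints in image0 (x, y)
mkpts1 = batch["mkpts1_f"].cpu().numpy()  # corresponding points in image1
print(f"Found {len(mkpts0)} matches")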
Reproduce the testing results for relative pose estimation
You need to download the METU-VisTIR dataset. After downloading, unzip the required files and create a symlink to the data folder:
unzip downloaded-file.zip
# set up symlinks
ln -s /path/to/METU_VisTIR/ /path/to/XoFTR/data/
conda activate xoftr
python test_relative_pose.py xoftr --ckpt weights/weights_xoftr_640.ckpt
# with visualization
python test_relative_pose.py xoftr --ckpt weights/weights_xoftr_640.ckpt --save_figs
The results and figures are saved to results_relative_pose/.
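For context, this evaluation recovers the relative camera pose from the predicted matches. A generic sketch of that step with OpenCV is given below; it is illustrative rather than the repo's exact evaluation code, and mkpts0, mkpts1, K0, K1 are placeholders for the matches and camera intrinsics.
# Generic pose-from-matches recovery (not the repo's exact evaluation code).
import cv2
import numpy as np

def estimate_relative_pose(mkpts0, mkpts1, K0, K1, ransac_thr=0.5):
    """Recover (R, t) from matched points via a RANSAC essential-matrix fit."""
    # Normalize pixel coordinates with the camera intrinsics.
    pts0 = cv2.undistortPoints(mkpts0[:, None], K0, None)[:, 0]
    pts1 = cv2.undistortPoints(mkpts1[:, None], K1, None)[:, 0]
    # The threshold is given in pixels; convert it to normalized coordinates.
    thr = ransac_thr / np.mean([K0[0, 0], K0[1, 1], K1[0, 0], K1[1, 1]])
    E, inliers = cv2.findEssentialMat(
        pts0, pts1, np.eye(3), method=cv2.RANSAC, prob=0.99999, threshold=thr)
    if E is None:
        return None
    _, R, t, _ = cv2.recoverPose(E, pts0, pts1, np.eye(3), mask=inliers)
    return R, t  # translation is recovered up to scale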
Training
See Training XoFTR for more details.
Citation
If you find this code useful for your research, please use the following BibTeX entry.
@inproceedings{tuzcuouglu2024xoftr,
title={XoFTR: Cross-modal Feature Matching Transformer},
author={Tuzcuo{\u{g}}lu, {\"O}nder and K{\"o}ksal, Aybora and Sofu, Bu{\u{g}}ra and Kalkan, Sinan and Alatan, A Aydin},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={4275--4286},
year={2024}
}
Acknowledgement
This code is derived from LoFTR. We are grateful to the authors for releasing their source code.