Doppelgangers: Learning to Disambiguate Images of Similar Structures
September 5, 2023
Authors: Ruojin Cai, Joseph Tung, Qianqian Wang, Hadar Averbuch-Elor, Bharath Hariharan, Noah Snavely
cs.AI
Abstract
We consider the visual disambiguation task of determining whether a pair of
visually similar images depict the same or distinct 3D surfaces (e.g., the same
or opposite sides of a symmetric building). Illusory image matches, where two
images observe distinct but visually similar 3D surfaces, can be challenging
for humans to differentiate, and can also lead 3D reconstruction algorithms to
produce erroneous results. We propose a learning-based approach to visual
disambiguation, formulating it as a binary classification task on image pairs.
To that end, we introduce a new dataset for this problem, Doppelgangers, which
includes image pairs of similar structures with ground truth labels. We also
design a network architecture that takes the spatial distribution of local
keypoints and matches as input, allowing for better reasoning about both local
and global cues. Our evaluation shows that our method can distinguish illusory
matches in difficult cases, and can be integrated into SfM pipelines to produce
correct, disambiguated 3D reconstructions. See our project page for our code,
datasets, and more results: http://doppelgangers-3d.github.io/.
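
The abstract says the classifier takes the spatial distribution of local keypoints and matches as input. One plausible way to encode that distribution is to rasterize keypoint and match locations into binary masks that a pairwise classifier could consume alongside the images. The sketch below illustrates this idea only. The function names (`rasterize`, `pair_masks`) and the mask layout are assumptions made for illustration, not the authors' released code.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]   # (x, y) in pixel coordinates
Mask = List[List[int]]        # height x width grid of 0/1 values


def rasterize(points: Sequence[Point], width: int, height: int) -> Mask:
    """Return a binary mask with 1 at every pixel that contains a point."""
    mask = [[0] * width for _ in range(height)]
    for x, y in points:
        col, row = math.floor(x), math.floor(y)
        if 0 <= row < height and 0 <= col < width:  # ignore out-of-bounds points
            mask[row][col] = 1
    return mask


def pair_masks(
    kps_a: Sequence[Point],
    kps_b: Sequence[Point],
    matches: Sequence[Tuple[int, int]],
    size_a: Tuple[int, int],
    size_b: Tuple[int, int],
) -> Dict[str, Mask]:
    """Build keypoint masks and match masks for an image pair.

    `matches` holds (i, j) index pairs: keypoint i in image A matches
    keypoint j in image B. Sizes are (width, height).
    """
    matched_a = [kps_a[i] for i, _ in matches]
    matched_b = [kps_b[j] for _, j in matches]
    return {
        "keypoints_a": rasterize(kps_a, *size_a),
        "keypoints_b": rasterize(kps_b, *size_b),
        "matches_a": rasterize(matched_a, *size_a),
        "matches_b": rasterize(matched_b, *size_b),
    }


# Toy example: two 4x3 images, two keypoints each, one match.
masks = pair_masks(
    kps_a=[(1.2, 0.7), (3.9, 2.1)],
    kps_b=[(0.0, 0.0), (2.5, 1.5)],
    matches=[(1, 1)],
    size_a=(4, 3),
    size_b=(4, 3),
)
```

In this encoding, a binary classifier over image pairs would receive the four masks as extra input channels, so it can see where matches cluster relative to all detected keypoints rather than only how many matches exist.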