MegaLoc: One Retrieval to Place Them All

Abstract

Retrieving images from the same location as a given query is an important component of multiple computer vision tasks, such as Visual Place Recognition, Landmark Retrieval, Visual Localization, 3D reconstruction, and SLAM. However, existing solutions are built specifically for one of these tasks, and are known to fail when the requirements change slightly or when they encounter out-of-distribution data. In this paper we combine a variety of existing methods, training techniques, and datasets to train a retrieval model, called MegaLoc, that performs well on multiple tasks. We find that MegaLoc (1) achieves state-of-the-art results on a large number of Visual Place Recognition datasets, (2) shows impressive results on common Landmark Retrieval datasets, and (3) sets a new state of the art for Visual Localization on the LaMAR datasets, where we changed only the retrieval method in the existing localization pipeline. The code for MegaLoc is available at https://github.com/gmberton/MegaLoc
