ViSTA-SLAM: Visual SLAM with Symmetric Two-view Association

September 1, 2025
Authors: Ganlin Zhang, Shenhan Qian, Xi Wang, Daniel Cremers
cs.AI

Abstract

We present ViSTA-SLAM, a real-time monocular visual SLAM system that operates without requiring camera intrinsics, making it broadly applicable across diverse camera setups. At its core, the system employs a lightweight symmetric two-view association (STA) model as the frontend, which simultaneously estimates relative camera poses and regresses local pointmaps from only two RGB images. This design significantly reduces model complexity (our frontend is only 35% the size of comparable state-of-the-art methods) while enhancing the quality of the two-view constraints used in the pipeline. In the backend, we construct a specially designed Sim(3) pose graph that incorporates loop closures to address accumulated drift. Extensive experiments demonstrate that our approach achieves superior performance in both camera tracking and dense 3D reconstruction quality compared to current methods. GitHub repository: https://github.com/zhangganlin/vista-slam
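To make the backend idea concrete, below is a minimal sketch (in Python with NumPy) of how relative Sim(3) estimates from a two-view frontend could be chained into a pose graph with loop-closure edges. This is not the authors' implementation; the class and function names (`Sim3PoseGraph`, `sim3_matrix`, `add_loop_closure`) and the 4x4-matrix representation are assumptions chosen for illustration only.

```python
import numpy as np

def sim3_matrix(scale, R, t):
    """Build a 4x4 Sim(3) transform [[s*R, t], [0, 1]] from scale, rotation, translation.
    (Hypothetical helper, not part of the ViSTA-SLAM codebase.)"""
    T = np.eye(4)
    T[:3, :3] = scale * np.asarray(R)
    T[:3, 3] = np.asarray(t)
    return T

class Sim3PoseGraph:
    """Toy Sim(3) pose graph: nodes are keyframe poses, edges are relative constraints."""

    def __init__(self):
        self.nodes = []   # list of 4x4 Sim(3) keyframe poses in the world frame
        self.edges = []   # (i, j, T_ij) relative constraints, including loop closures

    def add_keyframe(self, T_rel=None):
        """Append a keyframe; T_rel is the frontend's relative Sim(3) from the previous one."""
        if not self.nodes:
            self.nodes.append(np.eye(4))  # first keyframe fixed as the world origin
        else:
            self.nodes.append(self.nodes[-1] @ T_rel)  # Sim(3) composition via matrix product
            self.edges.append((len(self.nodes) - 2, len(self.nodes) - 1, T_rel))
        return len(self.nodes) - 1

    def add_loop_closure(self, i, j, T_ij):
        """Add an extra edge between non-adjacent keyframes i and j."""
        # A real backend would re-optimize all node poses to minimize the
        # residuals over these edges, correcting accumulated scale and pose drift.
        self.edges.append((i, j, T_ij))
```

In a full system, the accumulated edges would feed a Sim(3) graph optimizer (e.g., nonlinear least squares over the edge residuals), which is what allows loop closures to remove the drift that pure frame-to-frame chaining accumulates.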