
3D Reconstruction with Spatial Memory

August 28, 2024
作者: Hengyi Wang, Lourdes Agapito
cs.AI

Abstract

We present Spann3R, a novel approach for dense 3D reconstruction from ordered or unordered image collections. Built on the DUSt3R paradigm, Spann3R uses a transformer-based architecture to directly regress pointmaps from images without any prior knowledge of the scene or camera parameters. Unlike DUSt3R, which predicts per image-pair pointmaps each expressed in its local coordinate frame, Spann3R can predict per-image pointmaps expressed in a global coordinate system, thus eliminating the need for optimization-based global alignment. The key idea of Spann3R is to manage an external spatial memory that learns to keep track of all previous relevant 3D information. Spann3R then queries this spatial memory to predict the 3D structure of the next frame in a global coordinate system. Taking advantage of DUSt3R's pre-trained weights, and further fine-tuning on a subset of datasets, Spann3R shows competitive performance and generalization ability on various unseen datasets and can process ordered image collections in real time. Project page: https://hengyiwang.github.io/projects/spanner
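The abstract's key idea — an external spatial memory that accumulates features from past frames and is queried to place the next frame's pointmap in a global coordinate system — can be illustrated with a minimal sketch. This is not Spann3R's implementation; the `SpatialMemory` class, its `write`/`read` methods, and the NumPy cross-attention are hypothetical simplifications of the memory read/write mechanism described in the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

class SpatialMemory:
    """Toy external memory holding (key, value) features from previous frames.

    In Spann3R the memory stores learned features tied to 3D information;
    here we just stack raw feature vectors to show the query mechanism.
    """
    def __init__(self, dim):
        self.keys = np.empty((0, dim))
        self.values = np.empty((0, dim))

    def write(self, keys, values):
        # Append the current frame's features so later frames can attend to them.
        self.keys = np.vstack([self.keys, keys])
        self.values = np.vstack([self.values, values])

    def read(self, queries):
        # Cross-attention: each query token attends over all stored keys,
        # returning a fused feature used to predict geometry in the
        # shared (global) coordinate frame.
        scale = np.sqrt(queries.shape[-1])
        attn = softmax(queries @ self.keys.T / scale)
        return attn @ self.values

# Usage: write features for two past frames, then query with a new frame.
rng = np.random.default_rng(0)
dim = 8
memory = SpatialMemory(dim)
for _ in range(2):                       # two previously processed frames
    feats = rng.standard_normal((16, dim))
    memory.write(feats, feats)
new_frame_queries = rng.standard_normal((16, dim))
fused = memory.read(new_frame_queries)   # shape (16, dim)
```

The point of the sketch is the incremental loop: each processed frame writes into the memory, and the next frame reads from it, which is what removes the need for pairwise prediction followed by optimization-based global alignment.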