3D Reconstruction with Spatial Memory

August 28, 2024
Authors: Hengyi Wang, Lourdes Agapito
cs.AI

Abstract

We present Spann3R, a novel approach for dense 3D reconstruction from ordered or unordered image collections. Built on the DUSt3R paradigm, Spann3R uses a transformer-based architecture to directly regress pointmaps from images without any prior knowledge of the scene or camera parameters. Unlike DUSt3R, which predicts per image-pair pointmaps each expressed in its local coordinate frame, Spann3R can predict per-image pointmaps expressed in a global coordinate system, thus eliminating the need for optimization-based global alignment. The key idea of Spann3R is to manage an external spatial memory that learns to keep track of all previous relevant 3D information. Spann3R then queries this spatial memory to predict the 3D structure of the next frame in a global coordinate system. Taking advantage of DUSt3R's pre-trained weights, and further fine-tuning on a subset of datasets, Spann3R shows competitive performance and generalization ability on various unseen datasets and can process ordered image collections in real time. Project page: https://hengyiwang.github.io/projects/spanner
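
The read-then-write loop described in the abstract (query a memory of past frames, predict the next frame, store the result) can be illustrated with a toy sketch. This is not the Spann3R implementation: the class `SpatialMemory`, the function `process_sequence`, the use of plain softmax attention, and the choice to store each frame's feature as the key are all assumptions made here for illustration. The actual model operates on transformer features and regresses pointmaps.

```python
import math


class SpatialMemory:
    """Toy key-value memory (hypothetical; not the authors' module)."""

    def __init__(self):
        self.keys = []    # one feature vector per stored entry
        self.values = []  # one value vector per stored entry

    def write(self, key, value):
        """Append a (key, value) pair to the memory."""
        self.keys.append(list(key))
        self.values.append(list(value))

    def read(self, query):
        """Softmax-attention readout: a weighted average of stored values."""
        if not self.keys:
            raise ValueError("memory is empty")
        scale = 1.0 / math.sqrt(len(query))
        scores = [scale * sum(q * k for q, k in zip(query, key))
                  for key in self.keys]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        dim = len(self.values[0])
        return [sum(w * v[d] for w, v in zip(weights, self.values))
                for d in range(dim)]


def process_sequence(frame_features, memory):
    """For each frame: read context from memory, then write the frame back."""
    outputs = []
    for feat in frame_features:
        if memory.keys:
            context = memory.read(feat)
        else:
            context = list(feat)  # first frame: nothing to query yet
        outputs.append(context)
        memory.write(feat, context)
    return outputs
```

Because each frame is handled in a single pass against the accumulated memory, a loop of this shape needs no joint optimization over all frames after the fact, which is the property the abstract contrasts with optimization-based global alignment.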