InstantSplat: Unbounded Sparse-view Pose-free Gaussian Splatting in 40 Seconds
March 29, 2024
Authors: Zhiwen Fan, Wenyan Cong, Kairun Wen, Kevin Wang, Jian Zhang, Xinghao Ding, Danfei Xu, Boris Ivanovic, Marco Pavone, Georgios Pavlakos, Zhangyang Wang, Yue Wang
cs.AI
Abstract
While novel view synthesis (NVS) has made substantial progress in 3D computer
vision, it typically requires an initial estimation of camera intrinsics and
extrinsics from dense viewpoints. This pre-processing is usually conducted via
a Structure-from-Motion (SfM) pipeline, a procedure that can be slow and
unreliable, particularly in sparse-view scenarios with insufficient matched
features for accurate reconstruction. In this work, we integrate the strengths
of point-based representations (e.g., 3D Gaussian Splatting, 3D-GS) with
end-to-end dense stereo models (DUSt3R) to tackle the complex yet unresolved
issues in NVS under unconstrained settings, which encompass pose-free and
sparse-view challenges. Our framework, InstantSplat, unifies dense stereo
priors with 3D-GS to build 3D Gaussians of large-scale scenes from sparse-view and
pose-free images in less than 1 minute. Specifically, InstantSplat comprises a
Coarse Geometric Initialization (CGI) module that swiftly establishes a
preliminary scene structure and camera parameters across all training views,
utilizing globally-aligned 3D point maps derived from a pre-trained dense
stereo pipeline. This is followed by the Fast 3D-Gaussian Optimization (F-3DGO)
module, which jointly optimizes the 3D Gaussian attributes and the initialized
poses with pose regularization. Experiments conducted on the large-scale
outdoor Tanks & Temples dataset demonstrate that InstantSplat significantly
improves SSIM (by 32%) while concurrently reducing Absolute Trajectory Error
(ATE) by 80%. These results establish InstantSplat as a viable solution for
scenarios involving pose-free and sparse-view conditions. Project page:
instantsplat.github.io.
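
The abstract reports pose accuracy as Absolute Trajectory Error (ATE). As a rough illustration of how such a number is commonly obtained, the sketch below aligns estimated camera centers to ground truth with a similarity transform (the Umeyama method) and then takes the RMSE of the residuals. The abstract does not state the evaluation protocol, so the choice of similarity alignment is an assumption here, and the function names `umeyama_align` and `ate_rmse` are hypothetical.

```python
import numpy as np

def umeyama_align(src, dst):
    """Least-squares similarity transform (s, R, t) mapping src onto dst.

    src, dst: (N, 3) arrays of corresponding camera centers.
    """
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    xs, xd = src - mu_s, dst - mu_d
    cov = xd.T @ xs / len(src)             # 3x3 cross-covariance
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0                     # avoid returning a reflection
    R = U @ S @ Vt
    var_s = (xs ** 2).sum() / len(src)     # variance of the source points
    s = np.trace(np.diag(D) @ S) / var_s
    t = mu_d - s * R @ mu_s
    return s, R, t

def ate_rmse(est, gt):
    """RMSE of camera-center error after similarity alignment (assumed protocol)."""
    s, R, t = umeyama_align(est, gt)
    aligned = s * est @ R.T + t
    return float(np.sqrt(((aligned - gt) ** 2).sum(axis=1).mean()))
```

Because pose-free reconstructions are only defined up to a global scale, rotation, and translation, aligning before measuring keeps the error from being dominated by the arbitrary choice of coordinate frame.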