InstantSplat: Unbounded Sparse-view Pose-free Gaussian Splatting in 40 Seconds
March 29, 2024
Authors: Zhiwen Fan, Wenyan Cong, Kairun Wen, Kevin Wang, Jian Zhang, Xinghao Ding, Danfei Xu, Boris Ivanovic, Marco Pavone, Georgios Pavlakos, Zhangyang Wang, Yue Wang
cs.AI
Abstract
While novel view synthesis (NVS) has made substantial progress in 3D computer
vision, it typically requires an initial estimation of camera intrinsics and
extrinsics from dense viewpoints. This pre-processing is usually conducted via
a Structure-from-Motion (SfM) pipeline, a procedure that can be slow and
unreliable, particularly in sparse-view scenarios with insufficient matched
features for accurate reconstruction. In this work, we integrate the strengths
of point-based representations (e.g., 3D Gaussian Splatting, 3D-GS) with
end-to-end dense stereo models (DUSt3R) to tackle the complex yet unresolved
issues in NVS under unconstrained settings, which encompass pose-free and
sparse-view challenges. Our framework, InstantSplat, unifies dense stereo
priors with 3D-GS to build 3D Gaussians of large-scale scenes from sparse-view
and pose-free images in less than 1 minute. Specifically, InstantSplat comprises a
Coarse Geometric Initialization (CGI) module that swiftly establishes a
preliminary scene structure and camera parameters across all training views,
utilizing globally-aligned 3D point maps derived from a pre-trained dense
stereo pipeline. This is followed by the Fast 3D-Gaussian Optimization (F-3DGO)
module, which jointly optimizes the 3D Gaussian attributes and the initialized
poses with pose regularization. Experiments conducted on the large-scale
outdoor Tanks & Temples dataset demonstrate that InstantSplat significantly
improves SSIM (by 32%) while concurrently reducing Absolute Trajectory Error
(ATE) by 80%. These results establish InstantSplat as a viable solution for
scenarios involving pose-free and sparse-view conditions. Project page:
instantsplat.github.io.
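
The abstract reports an Absolute Trajectory Error (ATE) reduction without defining the metric. The sketch below shows one common way to compute it, assuming camera centers are given as N x 3 arrays and the estimated trajectory is first aligned to the ground truth with a least-squares similarity transform (Umeyama alignment). A pose-free method recovers poses in an arbitrary frame and scale, so some alignment of this kind is usually needed, but the abstract does not say which protocol the authors used. The function names `umeyama_alignment` and `absolute_trajectory_error` are hypothetical and are not taken from the InstantSplat code.

```python
import numpy as np


def umeyama_alignment(src, dst):
    """Least-squares similarity transform (s, R, t) with dst ~ s * R @ src + t.

    src, dst: (N, 3) arrays of corresponding camera centers (assumed layout).
    """
    mu_src, mu_dst = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_src, dst - mu_dst
    cov = dst_c.T @ src_c / len(src)          # 3x3 cross-covariance
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0                        # keep R a proper rotation
    R = U @ S @ Vt
    var_src = (src_c ** 2).sum() / len(src)   # mean squared spread of src
    s = (D * np.diag(S)).sum() / var_src
    t = mu_dst - s * R @ mu_src
    return s, R, t


def absolute_trajectory_error(est, gt):
    """RMSE of camera-center distances after aligning est to gt."""
    s, R, t = umeyama_alignment(est, gt)
    aligned = s * est @ R.T + t
    return float(np.sqrt(np.mean(np.sum((aligned - gt) ** 2, axis=1))))


# Illustrative data only: a made-up ground-truth trajectory and an
# "estimate" that differs from it by a known scale, rotation, and shift.
gt = np.array([[0.0, 0.0, 0.0],
               [1.0, 0.0, 0.0],
               [1.0, 1.0, 0.0],
               [0.0, 1.0, 1.0],
               [2.0, 1.0, 3.0]])
angle = np.pi / 6
Rz = np.array([[np.cos(angle), -np.sin(angle), 0.0],
               [np.sin(angle),  np.cos(angle), 0.0],
               [0.0,            0.0,           1.0]])
est_exact = 0.5 * gt @ Rz.T + np.array([1.0, 2.0, 3.0])
est_noisy = est_exact.copy()
est_noisy[2] += np.array([0.1, -0.05, 0.02])  # perturb one camera center

ate_exact = absolute_trajectory_error(est_exact, gt)
ate_noisy = absolute_trajectory_error(est_noisy, gt)
print(ate_exact, ate_noisy)
```

In this toy example the first estimate differs from the ground truth only by a similarity transform, so its ATE is zero up to floating-point error, while the perturbed estimate yields a strictly positive error. The sketch measures position error only; rotation error is typically reported by a separate metric.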