Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass
January 23, 2025
Authors: Jianing Yang, Alexander Sax, Kevin J. Liang, Mikael Henaff, Hao Tang, Ang Cao, Joyce Chai, Franziska Meier, Matt Feiszli
cs.AI
Abstract
Multi-view 3D reconstruction remains a core challenge in computer vision,
particularly in applications requiring accurate and scalable representations
across diverse perspectives. Current leading methods such as DUSt3R employ a
fundamentally pairwise approach, processing images in pairs and necessitating
costly global alignment procedures to reconstruct from multiple views. In this
work, we propose Fast 3D Reconstruction (Fast3R), a novel multi-view
generalization to DUSt3R that achieves efficient and scalable 3D reconstruction
by processing many views in parallel. Fast3R's Transformer-based architecture
forwards N images in a single forward pass, bypassing the need for iterative
alignment. Through extensive experiments on camera pose estimation and 3D
reconstruction, Fast3R demonstrates state-of-the-art performance, with
significant improvements in inference speed and reduced error accumulation.
These results establish Fast3R as a robust alternative for multi-view
applications, offering enhanced scalability without compromising reconstruction
accuracy.
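
To make the scaling argument concrete, here is a minimal sketch that counts network forward passes under two regimes. It is not taken from the paper: it assumes a pairwise method that evaluates every unordered image pair (real pipelines may prune the pair graph or run both orderings), and the function names `pairwise_passes` and `single_pass_count` are hypothetical helpers introduced only for illustration.

```python
from math import comb


def pairwise_passes(num_views: int) -> int:
    """Forward passes if every unordered pair of views is processed once.

    Assumption: an exhaustive pairing graph; sparser graphs need fewer passes.
    """
    return comb(num_views, 2)


def single_pass_count(num_views: int) -> int:
    """Forward passes if all views are fed to the network together."""
    return 1 if num_views > 0 else 0


if __name__ == "__main__":
    # Compare how the two counts grow with the number of input views.
    for n in (2, 10, 100, 1000):
        print(f"{n:>5} views: pairwise={pairwise_passes(n):>7}, "
              f"joint={single_pass_count(n)}")
```

Under these assumptions, the pairwise count grows quadratically with the number of views while the joint count stays at one, and the pairwise outputs would still need a separate global alignment step. The sketch says nothing about the cost of each pass, which also depends on how attention scales with the number of views.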