Splatter Image: Ultra-Fast Single-View 3D Reconstruction
December 20, 2023
Authors: Stanislaw Szymanowicz, Christian Rupprecht, Andrea Vedaldi
cs.AI
Abstract
We introduce the Splatter Image, an ultra-fast approach for monocular 3D
object reconstruction which operates at 38 FPS. Splatter Image is based on
Gaussian Splatting, which has recently brought real-time rendering, fast
training, and excellent scaling to multi-view reconstruction. For the first
time, we apply Gaussian Splatting in a monocular reconstruction setting. Our
approach is learning-based, and, at test time, reconstruction only requires the
feed-forward evaluation of a neural network. The main innovation of Splatter
Image is the surprisingly straightforward design: it uses a 2D image-to-image
network to map the input image to one 3D Gaussian per pixel. The resulting
Gaussians thus have the form of an image, the Splatter Image. We further extend
the method to incorporate more than one image as input, which we do by adding
cross-view attention. Owing to the speed of the renderer (588 FPS), we can use
a single GPU for training while generating entire images at each iteration in
order to optimize perceptual metrics like LPIPS. On standard benchmarks, we
demonstrate not only fast reconstruction but also better results than recent
and much more expensive baselines in terms of PSNR, LPIPS, and other metrics.
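
The "one 3D Gaussian per pixel" idea can be illustrated with a minimal sketch. The abstract does not specify how the network output is laid out, so everything below is an assumption made for illustration: the per-pixel channel order, the use of a depth along the pixel's camera ray plus a 3D offset for the mean, the sigmoid and exponential activations, the pinhole camera with a single focal length, and the hypothetical function names pixel_ray and decode_splatter_image. Rotation is omitted to keep the example short. The input grid stands in for the output of the 2D image-to-image network.

```python
import math


def pixel_ray(u, v, width, height, focal):
    """Camera-space ray through the centre of pixel (u, v), scaled so z = 1.

    Assumes a pinhole camera with the principal point at the image centre.
    """
    cx, cy = width / 2.0, height / 2.0
    return ((u + 0.5 - cx) / focal, (v + 0.5 - cy) / focal, 1.0)


def decode_splatter_image(channels, focal):
    """Turn an H x W grid of raw per-pixel outputs into a list of Gaussians.

    Assumed (hypothetical) channel layout per pixel:
    [depth, dx, dy, dz, opacity_logit, log_sx, log_sy, log_sz, r, g, b]
    """
    height, width = len(channels), len(channels[0])
    gaussians = []
    for v in range(height):
        for u in range(width):
            c = channels[v][u]
            ray = pixel_ray(u, v, width, height, focal)
            # Mean: point at the predicted depth along the ray, plus an offset.
            mean = tuple(c[0] * r + o for r, o in zip(ray, c[1:4]))
            # Sigmoid keeps opacity in (0, 1); exp keeps scales positive.
            opacity = 1.0 / (1.0 + math.exp(-c[4]))
            scale = tuple(math.exp(s) for s in c[5:8])
            gaussians.append({
                "mean": mean,
                "opacity": opacity,
                "scale": scale,
                "color": tuple(c[8:11]),
            })
    return gaussians


# Usage: a 2 x 2 "Splatter Image" with identical raw outputs at every pixel.
pixel = [2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
grid = [[list(pixel) for _ in range(2)] for _ in range(2)]
gaussians = decode_splatter_image(grid, focal=1.0)
```

Because the Gaussians are stored on the same grid as the input pixels, an H x W input always yields exactly H x W Gaussians in this sketch, which is what allows an ordinary 2D network to produce the whole 3D representation in a single forward pass.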