ARTDECO: Towards Efficient and High-Fidelity On-the-Fly 3D Reconstruction with Structured Scene Representation

Abstract

On-the-fly 3D reconstruction from monocular image sequences is a long-standing challenge in computer vision, critical for applications such as real-to-sim, AR/VR, and robotics. Existing methods face a major tradeoff: per-scene optimization yields high fidelity but is computationally expensive, whereas feed-forward foundation models enable real-time inference but struggle with accuracy and robustness. In this work, we propose ARTDECO, a unified framework that combines the efficiency of feed-forward models with the reliability of SLAM-based pipelines. ARTDECO uses 3D foundation models for pose estimation and point prediction, coupled with a Gaussian decoder that transforms multi-scale features into structured 3D Gaussians. To sustain both fidelity and efficiency at scale, we design a hierarchical Gaussian representation with a level-of-detail (LoD)-aware rendering strategy, which improves rendering fidelity while reducing redundancy. Experiments on eight diverse indoor and outdoor benchmarks show that ARTDECO delivers interactive performance comparable to SLAM, robustness similar to feed-forward systems, and reconstruction quality close to per-scene optimization, providing a practical path toward on-the-fly digitization of real-world environments with both accurate geometry and high visual fidelity. Explore more demos on our project page: https://city-super.github.io/artdeco/.
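The abstract does not specify how LoD-aware rendering chooses which Gaussians to draw. As a rough illustration of the general idea only, the sketch below applies a generic distance-threshold rule to a set of Gaussians tagged with a hierarchy level. The `Gaussian` class, the function names, the `base_distance` and `num_levels` parameters, and the halving rule are all assumptions made for this example and are not taken from ARTDECO.

```python
import math
from dataclasses import dataclass


@dataclass
class Gaussian:
    """Hypothetical stand-in for one Gaussian in a hierarchical representation."""
    position: tuple[float, float, float]
    level: int  # 0 = coarsest; larger values = finer detail


def max_level_for_distance(distance, base_distance, num_levels):
    """Finest level rendered at this camera distance (assumed rule).

    At or beyond base_distance only level 0 is used; each halving of the
    distance unlocks one finer level, capped at num_levels - 1.
    """
    if distance >= base_distance:
        return 0
    distance = max(distance, 1e-9)  # avoid division by zero at the camera
    level = int(math.floor(math.log2(base_distance / distance)))
    return min(level, num_levels - 1)


def select_gaussians(gaussians, camera_position, base_distance=8.0, num_levels=4):
    """Keep only Gaussians whose level is coarse enough for their distance."""
    selected = []
    for g in gaussians:
        d = math.dist(g.position, camera_position)
        if g.level <= max_level_for_distance(d, base_distance, num_levels):
            selected.append(g)
    return selected
```

Under this assumed rule, distant regions are drawn with coarse Gaussians only, while nearby regions also receive the finer levels. This is one simple way a hierarchy could reduce redundancy without discarding close-up detail.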
