Differentiable Blocks World: Qualitative 3D Decomposition by Rendering Primitives

July 11, 2023
Authors: Tom Monnier, Jake Austin, Angjoo Kanazawa, Alexei A. Efros, Mathieu Aubry
cs.AI

Abstract

Given a set of calibrated images of a scene, we present an approach that produces a simple, compact, and actionable 3D world representation by means of 3D primitives. While many approaches focus on recovering high-fidelity 3D scenes, we focus on parsing a scene into mid-level 3D representations made of a small set of textured primitives. Such representations are interpretable, easy to manipulate and suited for physics-based simulations. Moreover, unlike existing primitive decomposition methods that rely on 3D input data, our approach operates directly on images through differentiable rendering. Specifically, we model primitives as textured superquadric meshes and optimize their parameters from scratch with an image rendering loss. We highlight the importance of modeling transparency for each primitive, which is critical for optimization and also enables handling varying numbers of primitives. We show that the resulting textured primitives faithfully reconstruct the input images and accurately model the visible 3D points, while providing amodal shape completions of unseen object regions. We compare our approach to the state of the art on diverse scenes from DTU, and demonstrate its robustness on real-life captures from BlendedMVS and Nerfstudio. We also showcase how our results can be used to effortlessly edit a scene or perform physical simulations. Code and video results are available at https://www.tmonnier.com/DBW.
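
To make the notion of a superquadric primitive more concrete, the sketch below samples surface points using the standard textbook superquadric parametrization. It is an illustration only, not the authors' code: the function names (`signed_pow`, `superquadric_point`, `inside_outside`, `sample_superquadric`), the parameter layout (`scale`, `eps`), and the sampling grid are all assumptions made for this example. The abstract does not specify the exact parametrization, mesh topology, texturing, or transparency model used in the paper.

```python
import math


def signed_pow(value, exponent):
    """Return sign(value) * |value| ** exponent (keeps the shape symmetric)."""
    return math.copysign(abs(value) ** exponent, value)


def superquadric_point(eta, omega, scale=(1.0, 1.0, 1.0), eps=(1.0, 1.0)):
    """Point on a superquadric surface (standard parametrization, assumed here).

    eta   in [-pi/2, pi/2] is the latitude-like angle,
    omega in [-pi, pi)     is the longitude-like angle,
    scale = (a1, a2, a3)   are the axis lengths,
    eps   = (e1, e2)       control squareness; (1, 1) gives an ellipsoid.
    """
    a1, a2, a3 = scale
    e1, e2 = eps
    x = a1 * signed_pow(math.cos(eta), e1) * signed_pow(math.cos(omega), e2)
    y = a2 * signed_pow(math.cos(eta), e1) * signed_pow(math.sin(omega), e2)
    z = a3 * signed_pow(math.sin(eta), e1)
    return (x, y, z)


def inside_outside(point, scale=(1.0, 1.0, 1.0), eps=(1.0, 1.0)):
    """Implicit superquadric function: equals 1 on the surface, < 1 inside."""
    a1, a2, a3 = scale
    e1, e2 = eps
    x, y, z = point
    xy = (abs(x / a1) ** (2.0 / e2) + abs(y / a2) ** (2.0 / e2)) ** (e2 / e1)
    return xy + abs(z / a3) ** (2.0 / e1)


def sample_superquadric(n_eta=8, n_omega=16, scale=(1.0, 1.0, 1.0), eps=(1.0, 1.0)):
    """Sample a regular (eta, omega) grid of surface points (hypothetical grid)."""
    vertices = []
    for i in range(n_eta + 1):
        eta = -math.pi / 2 + math.pi * i / n_eta
        for j in range(n_omega):
            omega = -math.pi + 2 * math.pi * j / n_omega
            vertices.append(superquadric_point(eta, omega, scale, eps))
    return vertices
```

With `eps=(1.0, 1.0)` the sampled points lie on an ellipsoid, while smaller exponents push the shape toward a box, which is one reason a handful of such primitives can approximate many everyday objects. In an image-based setting like the one described above, these shape parameters would be among the quantities optimized through a differentiable renderer; that step is outside the scope of this sketch.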