Rethinking Open-Vocabulary Segmentation of Radiance Fields in 3D Space
August 14, 2024
Authors: Hyunjee Lee, Youngsik Yun, Jeongmin Bae, Seoha Kim, Youngjung Uh
cs.AI
Abstract
Understanding the 3D semantics of a scene is a fundamental problem for
various scenarios such as embodied agents. While NeRFs and 3DGS excel at
novel-view synthesis, previous methods for understanding their semantics have
been limited to incomplete 3D understanding: their segmentation results are 2D
masks and their supervision is anchored at 2D pixels. This paper revisits the
problem set to pursue a better 3D understanding of a scene modeled by NeRFs and
3DGS as follows. 1) We directly supervise the 3D points to train the language
embedding field. It achieves state-of-the-art accuracy without relying on
multi-scale language embeddings. 2) We transfer the pre-trained language field
to 3DGS, achieving the first real-time rendering speed without sacrificing
training time or accuracy. 3) We introduce a 3D querying and evaluation
protocol for assessing the reconstructed geometry and semantics together. Code,
checkpoints, and annotations will be available online. Project page:
https://hyunji12.github.io/Open3DRF
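As a rough illustration of the 3D querying idea in contribution 3 (the abstract does not give the authors' actual protocol, so the function name, the cosine-similarity matching, and the threshold below are assumptions), a minimal sketch: each 3D point carries a language embedding, and a text query selects points directly in 3D by embedding similarity, rather than via rendered 2D masks.

```python
import numpy as np

def query_points_3d(point_embeddings, text_embedding, threshold=0.5):
    """Hypothetical 3D query: select points whose language embedding
    matches the text query by cosine similarity.

    point_embeddings: (N, D) per-point language embeddings
    text_embedding:   (D,) query embedding (e.g. from a CLIP-style encoder)
    Returns a boolean mask over the N points.
    """
    p = point_embeddings / np.linalg.norm(point_embeddings, axis=1, keepdims=True)
    t = text_embedding / np.linalg.norm(text_embedding)
    sims = p @ t  # cosine similarity of each point to the query
    return sims >= threshold

# Toy example: two clusters of point embeddings, one aligned with the query.
rng = np.random.default_rng(0)
query = np.array([1.0, 0.0, 0.0])
fg = query + 0.05 * rng.standard_normal((5, 3))                    # matching points
bg = np.array([0.0, 1.0, 0.0]) + 0.05 * rng.standard_normal((5, 3))  # non-matching
mask = query_points_3d(np.vstack([fg, bg]), query)
print(mask)
```

The resulting mask lives on the 3D points themselves, which is what allows geometry and semantics to be evaluated together, instead of scoring only rendered 2D segmentations.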