Rethinking Open-Vocabulary Segmentation of Radiance Fields in 3D Space

August 14, 2024
Authors: Hyunjee Lee, Youngsik Yun, Jeongmin Bae, Seoha Kim, Youngjung Uh
cs.AI

Abstract

Understanding the 3D semantics of a scene is a fundamental problem for various scenarios such as embodied agents. While NeRFs and 3DGS excel at novel-view synthesis, previous methods for understanding their semantics have been limited to incomplete 3D understanding: their segmentation results are 2D masks and their supervision is anchored at 2D pixels. This paper revisits the problem set to pursue a better 3D understanding of a scene modeled by NeRFs and 3DGS as follows. 1) We directly supervise the 3D points to train the language embedding field. It achieves state-of-the-art accuracy without relying on multi-scale language embeddings. 2) We transfer the pre-trained language field to 3DGS, achieving the first real-time rendering speed without sacrificing training time or accuracy. 3) We introduce a 3D querying and evaluation protocol for assessing the reconstructed geometry and semantics together. Code, checkpoints, and annotations will be available online. Project page: https://hyunji12.github.io/Open3DRF
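
The abstract describes querying and evaluating semantics directly on 3D points rather than on rendered 2D masks. As a rough illustration only, and not the authors' implementation, the sketch below assumes each 3D point already carries a language embedding and that a text query has been embedded into the same space. The function names (`query_points`, `iou_3d`), the cosine-similarity scoring, the fixed threshold, and the point-wise IoU metric are all hypothetical choices made for this example.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; returns 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def query_points(
    point_embeddings: Sequence[Sequence[float]],
    text_embedding: Sequence[float],
    threshold: float = 0.5,  # hypothetical cutoff, not taken from the paper
) -> List[bool]:
    """Mark each 3D point whose embedding is similar enough to the text query."""
    return [
        cosine_similarity(emb, text_embedding) >= threshold
        for emb in point_embeddings
    ]


def iou_3d(pred: Sequence[bool], gt: Sequence[bool]) -> float:
    """Point-wise intersection over union between predicted and reference 3D masks."""
    intersection = sum(1 for p, g in zip(pred, gt) if p and g)
    union = sum(1 for p, g in zip(pred, gt) if p or g)
    # Two empty masks agree perfectly by convention.
    return intersection / union if union else 1.0


# Toy data: three points with 2-D embeddings (real embeddings are much larger).
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
query = [1.0, 0.0]

mask = query_points(embeddings, query, threshold=0.8)
score = iou_3d(mask, [True, False, False])
```

In this toy case the first two points are selected by the query, and comparing against a reference mask that contains only the first point gives an IoU of 0.5. The point of scoring in 3D is that the mask is defined over points in the scene, so the result does not depend on any particular camera view.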
