NeRF-Det: Learning Geometry-Aware Volumetric Representation for Multi-View 3D Object Detection

Abstract

We present NeRF-Det, a novel method for indoor 3D detection that takes posed RGB images as input. Existing indoor 3D detection methods struggle to model scene geometry; our method instead uses NeRF in an end-to-end manner to explicitly estimate 3D geometry, which improves 3D detection performance. Specifically, to avoid the significant extra latency of NeRF's per-scene optimization, we introduce sufficient geometry priors to enhance the generalizability of the NeRF-MLP. Furthermore, we connect the detection and NeRF branches through a shared MLP, which enables an efficient adaptation of NeRF to detection and yields geometry-aware volumetric representations for 3D detection. Our method outperforms state-of-the-art methods by 3.9 mAP and 3.1 mAP on the ScanNet and ARKitScenes benchmarks, respectively. We provide extensive analysis to shed light on how NeRF-Det works. As a result of our joint-training design, NeRF-Det generalizes well to unseen scenes for object detection, view synthesis, and depth estimation without requiring per-scene optimization. Code is available at https://github.com/facebookresearch/NeRF-Det.
