NeRF-Det: Learning Geometry-Aware Volumetric Representation for Multi-View 3D Object Detection

Abstract

We present NeRF-Det, a novel method for indoor 3D detection that takes posed RGB images as input. Unlike existing indoor 3D detection methods, which struggle to model scene geometry, our method uses NeRF in an end-to-end manner to explicitly estimate 3D geometry, thereby improving 3D detection performance. Specifically, to avoid the significant extra latency of NeRF's per-scene optimization, we introduce sufficient geometry priors to enhance the generalizability of the NeRF-MLP. Furthermore, we subtly connect the detection and NeRF branches through a shared MLP, which enables efficient adaptation of NeRF to detection and yields geometry-aware volumetric representations for 3D detection. Our method outperforms the state of the art by 3.9 mAP and 3.1 mAP on the ScanNet and ARKITScenes benchmarks, respectively. We provide extensive analysis to shed light on how NeRF-Det works. As a result of our joint-training design, NeRF-Det generalizes well to unseen scenes for object detection, view synthesis, and depth estimation without requiring per-scene optimization. Code is available at https://github.com/facebookresearch/NeRF-Det.
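
The abstract does not spell out how a NeRF branch can make a detection volume "geometry-aware". The following is only a minimal sketch of one plausible mechanism, not the authors' implementation: it assumes a density value is predicted per voxel, converts it to an opacity with the standard NeRF relation alpha = 1 - exp(-sigma * delta), and scales each voxel's feature vector by that opacity so that empty space contributes little to detection. The function names, the flat voxel layout, and the step size `step=0.1` are all hypothetical choices made for illustration.

```python
import math

def density_to_opacity(density: float, step: float) -> float:
    """Standard NeRF-style conversion: alpha = 1 - exp(-sigma * delta).

    Negative densities are clamped to zero (an assumption of this sketch).
    """
    return 1.0 - math.exp(-max(density, 0.0) * step)

def gate_voxel_features(features, densities, step=0.1):
    """Scale each voxel's feature vector by its opacity.

    features:  list of per-voxel feature vectors (hypothetical flat layout)
    densities: list of per-voxel volume densities, same length as features
    step:      assumed distance between adjacent samples (illustrative value)
    """
    gated = []
    for feat, sigma in zip(features, densities):
        alpha = density_to_opacity(sigma, step)
        gated.append([alpha * f for f in feat])
    return gated

# Three toy voxels: empty, nearly opaque, and semi-transparent.
features = [[1.0, -2.0], [0.5, 0.25], [3.0, 1.0]]
densities = [0.0, 50.0, 5.0]
gated = gate_voxel_features(features, densities)
```

In this toy setup the features of the empty voxel are suppressed to zero, while the features of the high-density voxel pass through almost unchanged. This is the sense in which such a gated volume carries geometric information.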

