FocalFormer3D: Focusing on Hard Instance for 3D Object Detection
August 8, 2023
Authors: Yilun Chen, Zhiding Yu, Yukang Chen, Shiyi Lan, Animashree Anandkumar, Jiaya Jia, Jose Alvarez
cs.AI
Abstract
False negatives (FN) in 3D object detection, e.g., missing predictions
of pedestrians, vehicles, or other obstacles, can lead to potentially dangerous
situations in autonomous driving. Although such misses can be fatal, this issue
is understudied in many current 3D detection methods. In this work, we propose Hard Instance
Probing (HIP), a general pipeline that identifies FN in a multi-stage
manner and guides the models to focus on excavating difficult instances. For 3D
object detection, we instantiate this method as FocalFormer3D, a simple yet
effective detector that excels at excavating difficult objects and improving
prediction recall. FocalFormer3D features multi-stage query generation to
discover hard objects and a box-level transformer decoder to efficiently
distinguish objects from a large pool of object candidates. Experimental results on the
nuScenes and Waymo datasets validate the superior performance of FocalFormer3D.
This advantage carries over to both detection and tracking, in
both LiDAR and multi-modal settings. Notably, FocalFormer3D achieves 70.5 mAP
and 73.9 NDS on the nuScenes detection benchmark and 72.1 AMOTA on the
nuScenes tracking benchmark, ranking first on the nuScenes LiDAR leaderboard
in both cases. Our code is available at
https://github.com/NVlabs/FocalFormer3D.
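
The abstract does not spell out how multi-stage query generation works, so the following is only a toy sketch of the general idea, not the authors' implementation. It assumes a 2D score heatmap over candidate object centers, and that each stage picks its top-scoring cells and then masks their neighborhood so later stages must look elsewhere. The function name `multi_stage_queries` and the parameters `num_stages`, `k`, and `radius` are hypothetical.

```python
def multi_stage_queries(heatmap, num_stages=3, k=2, radius=1):
    """Pick up to k top-scoring cells per stage, masking regions already chosen.

    Assumption: masking a square neighborhood around earlier picks stands in
    for "objects already found", so later stages surface weaker candidates.
    """
    rows, cols = len(heatmap), len(heatmap[0])
    masked = [[False] * cols for _ in range(rows)]
    stages = []
    for _ in range(num_stages):
        # Rank only the cells that earlier stages have not claimed.
        free = [(heatmap[r][c], r, c)
                for r in range(rows) for c in range(cols)
                if not masked[r][c]]
        # Highest score first; ties broken by row, then column.
        free.sort(key=lambda t: (-t[0], t[1], t[2]))
        picked = [(r, c) for _, r, c in free[:k]]
        if not picked:
            break
        stages.append(picked)
        # Mask a (2 * radius + 1)-wide square around every pick.
        for r, c in picked:
            for rr in range(max(0, r - radius), min(rows, r + radius + 1)):
                for cc in range(max(0, c - radius), min(cols, c + radius + 1)):
                    masked[rr][cc] = True
    return stages


heatmap = [
    [0.9, 0.8, 0.1, 0.0],
    [0.7, 0.6, 0.1, 0.3],
    [0.1, 0.1, 0.2, 0.1],
    [0.0, 0.4, 0.1, 0.5],
]
single = multi_stage_queries(heatmap, num_stages=1, k=2)
staged = multi_stage_queries(heatmap, num_stages=2, k=1)
print(single)  # both picks land on the strong peak in the top-left corner
print(staged)  # the second stage reaches the weaker peak at the bottom right
```

With the same budget of two queries, the single-stage call spends both on the dominant peak, while the two-stage call reserves one for a lower-scoring region. That is the intuition behind probing for hard instances stage by stage to improve recall.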