OmniNOCS: A unified NOCS dataset and model for 3D lifting of 2D objects

July 11, 2024
Authors: Akshay Krishnan, Abhijit Kundu, Kevis-Kokitsi Maninis, James Hays, Matthew Brown
cs.AI

Abstract

We propose OmniNOCS, a large-scale monocular dataset with 3D Normalized Object Coordinate Space (NOCS) maps, object masks, and 3D bounding box annotations for indoor and outdoor scenes. OmniNOCS has 20 times more object classes and 200 times more instances than existing NOCS datasets (NOCS-Real275, Wild6D). We use OmniNOCS to train a novel, transformer-based monocular NOCS prediction model (NOCSformer) that can predict accurate NOCS, instance masks and poses from 2D object detections across diverse classes. It is the first NOCS model that can generalize to a broad range of classes when prompted with 2D boxes. We evaluate our model on the task of 3D oriented bounding box prediction, where it achieves comparable results to state-of-the-art 3D detection methods such as Cube R-CNN. Unlike other 3D detection methods, our model also provides detailed and accurate 3D object shape and segmentation. We propose a novel benchmark for the task of NOCS prediction based on OmniNOCS, which we hope will serve as a useful baseline for future work in this area. Our dataset and code will be at the project website: https://omninocs.github.io.
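
To make the relationship between NOCS maps and 3D oriented bounding boxes concrete, here is a minimal sketch in Python. It is not the authors' code: the function names are hypothetical, and the normalization (dividing by the box extent along each object axis and centering at 0.5) is an assumption chosen for illustration. Other conventions exist, for example uniform scaling by the box diagonal.

```python
import numpy as np


def points_to_nocs(points_cam, rotation, center, size):
    """Map camera-frame points on an object to normalized object coordinates.

    Assumed convention (not taken from the paper): per-axis normalization,
    so the oriented box maps onto the unit cube [0, 1]^3.

    points_cam: (N, 3) points in the camera frame.
    rotation:   (3, 3) rotation from object frame to camera frame.
    center:     (3,) box center in the camera frame.
    size:       (3,) box extents along the object axes.
    """
    # Row-vector form of R^T (p - t): move points into the object frame.
    points_obj = (points_cam - center) @ rotation
    # Scale each axis by the box extent and shift the box center to 0.5.
    return points_obj / size + 0.5


def nocs_to_points(nocs, rotation, center, size):
    """Inverse mapping: lift NOCS values back to camera-frame 3D points."""
    points_obj = (nocs - 0.5) * size
    # Row-vector form of R p + t.
    return points_obj @ rotation.T + center


# Hypothetical box: rotated 90 degrees about the z axis.
rotation = np.array([[0.0, -1.0, 0.0],
                     [1.0, 0.0, 0.0],
                     [0.0, 0.0, 1.0]])
center = np.array([1.0, 2.0, 3.0])
size = np.array([2.0, 4.0, 6.0])

# The box center and one box corner, expressed in the camera frame.
corner_obj = size / 2.0
points_cam = np.stack([center, rotation @ corner_obj + center])
nocs = points_to_nocs(points_cam, rotation, center, size)
```

Under this convention the box center maps to (0.5, 0.5, 0.5) and the corners map to the corners of the unit cube. This invertibility is why a per-pixel NOCS map, combined with depth or a pose solver, can be used to recover an object's 3D pose and size.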
