BEHAVIOR Vision Suite: Customizable Dataset Generation via Simulation
May 15, 2024
Authors: Yunhao Ge, Yihe Tang, Jiashu Xu, Cem Gokmen, Chengshu Li, Wensi Ai, Benjamin Jose Martinez, Arman Aydin, Mona Anvari, Ayush K Chakravarthy, Hong-Xing Yu, Josiah Wong, Sanjana Srivastava, Sharon Lee, Shengxin Zha, Laurent Itti, Yunzhu Li, Roberto Martín-Martín, Miao Liu, Pengchuan Zhang, Ruohan Zhang, Li Fei-Fei, Jiajun Wu
cs.AI
Abstract
The systematic evaluation and understanding of computer vision models under
varying conditions require large amounts of data with comprehensive and
customized labels, which real-world vision datasets rarely satisfy. While
current synthetic data generators offer a promising alternative, particularly
for embodied AI tasks, they often fall short for computer vision tasks due to
low asset and rendering quality, limited diversity, and unrealistic physical
properties. We introduce the BEHAVIOR Vision Suite (BVS), a set of tools and
assets to generate fully customized synthetic data for systematic evaluation of
computer vision models, based on the newly developed embodied AI benchmark,
BEHAVIOR-1K. BVS supports a large number of adjustable parameters at the scene
level (e.g., lighting, object placement), the object level (e.g., joint
configuration, attributes such as "filled" and "folded"), and the camera level
(e.g., field of view, focal length). Researchers can arbitrarily vary these
parameters during data generation to perform controlled experiments. We
showcase three example application scenarios: systematically evaluating the
robustness of models across different continuous axes of domain shift,
evaluating scene understanding models on the same set of images, and training
and evaluating simulation-to-real transfer for a novel vision task: unary and
binary state prediction. Project website:
https://behavior-vision-suite.github.io/
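
The controlled experiments described above amount to sweeping parameters at the scene, object, and camera levels while holding the rest fixed. The following is a minimal sketch of that idea in plain Python. It does not use the BVS API: the axis names, values, and helper functions are hypothetical and chosen only for illustration. The one grounded piece is the standard pinhole-camera relation between focal length and field of view, with an assumed 36 mm sensor width.

```python
import itertools
import math

# Hypothetical parameter axes, one per level named in the abstract.
# These names and values are illustrative assumptions, not BVS identifiers.
AXES = {
    "scene.lighting_intensity": [0.25, 0.5, 1.0],
    "object.joint_openness": [0.0, 0.5, 1.0],
    "camera.focal_length_mm": [18.0, 35.0, 50.0],
}


def horizontal_fov_deg(focal_length_mm, sensor_width_mm=36.0):
    """Pinhole relation: fov = 2 * atan(sensor_width / (2 * focal_length)).

    The 36 mm default sensor width is an assumption for this sketch.
    """
    return math.degrees(2.0 * math.atan(sensor_width_mm / (2.0 * focal_length_mm)))


def one_axis_sweeps(axes, baseline):
    """Yield (axis_name, config) pairs that vary one axis at a time.

    All other parameters stay at their baseline value, so any change in a
    model's score can be attributed to the single axis being varied.
    """
    for name, values in axes.items():
        for value in values:
            config = dict(baseline)
            config[name] = value
            yield name, config


def full_grid(axes):
    """Yield every combination of parameter values (Cartesian product)."""
    names = list(axes)
    for combo in itertools.product(*(axes[n] for n in names)):
        yield dict(zip(names, combo))


# Use the middle value of each axis as the baseline configuration.
baseline = {name: values[1] for name, values in AXES.items()}

sweeps = list(one_axis_sweeps(AXES, baseline))
grid = list(full_grid(AXES))

if __name__ == "__main__":
    print(len(sweeps), "one-axis configs;", len(grid), "full-grid configs")
    for f in AXES["camera.focal_length_mm"]:
        print(f"focal length {f} mm -> horizontal FOV {horizontal_fov_deg(f):.1f} deg")
```

One-axis sweeps suit robustness studies along a single axis of domain shift, while the full grid grows multiplicatively with each added axis and is better reserved for small parameter sets.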