BEHAVIOR Vision Suite: Customizable Dataset Generation via Simulation
May 15, 2024
Authors: Yunhao Ge, Yihe Tang, Jiashu Xu, Cem Gokmen, Chengshu Li, Wensi Ai, Benjamin Jose Martinez, Arman Aydin, Mona Anvari, Ayush K Chakravarthy, Hong-Xing Yu, Josiah Wong, Sanjana Srivastava, Sharon Lee, Shengxin Zha, Laurent Itti, Yunzhu Li, Roberto Martín-Martín, Miao Liu, Pengchuan Zhang, Ruohan Zhang, Li Fei-Fei, Jiajun Wu
cs.AI
Abstract
The systematic evaluation and understanding of computer vision models under
varying conditions require large amounts of data with comprehensive and
customized labels, which real-world vision datasets rarely satisfy. While
current synthetic data generators offer a promising alternative, particularly
for embodied AI tasks, they often fall short for computer vision tasks due to
low asset and rendering quality, limited diversity, and unrealistic physical
properties. We introduce the BEHAVIOR Vision Suite (BVS), a set of tools and
assets to generate fully customized synthetic data for systematic evaluation of
computer vision models, based on the newly developed embodied AI benchmark,
BEHAVIOR-1K. BVS supports a large number of adjustable parameters at the scene
level (e.g., lighting, object placement), the object level (e.g., joint
configuration, attributes such as "filled" and "folded"), and the camera level
(e.g., field of view, focal length). Researchers can arbitrarily vary these
parameters during data generation to perform controlled experiments. We
showcase three example application scenarios: systematically evaluating the
robustness of models across different continuous axes of domain shift,
evaluating scene understanding models on the same set of images, and training
and evaluating simulation-to-real transfer for a novel vision task: unary and
binary state prediction. Project website:
https://behavior-vision-suite.github.io/
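
As a rough illustration of the controlled experiments described above, the sketch below varies one parameter at a time while holding the others at a default value. It is not the BVS API: `RenderConfig`, its three fields, `one_axis_sweeps`, and the 36 mm sensor width are hypothetical names and values chosen for illustration only. The one relation taken as given is the standard pinhole-camera link between field of view and focal length, f = w / (2 tan(FOV / 2)), which is why the two camera-level parameters are not independent for a fixed sensor width.

```python
import math
from dataclasses import dataclass, replace
from typing import Dict, List


@dataclass(frozen=True)
class RenderConfig:
    """Hypothetical bundle of one parameter per level named in the abstract."""
    light_intensity: float  # scene level (unit-free scale, assumed)
    joint_openness: float   # object level: 0.0 = closed, 1.0 = fully open (assumed)
    fov_deg: float          # camera level: horizontal field of view in degrees


def focal_length_mm(fov_deg: float, sensor_width_mm: float = 36.0) -> float:
    """Pinhole relation f = w / (2 * tan(FOV / 2)); sensor width is an assumption."""
    return sensor_width_mm / (2.0 * math.tan(math.radians(fov_deg) / 2.0))


def one_axis_sweeps(defaults: RenderConfig,
                    axes: Dict[str, List[float]]) -> List[RenderConfig]:
    """Return configs that each differ from `defaults` along a single axis."""
    configs = []
    for name, values in axes.items():
        for value in values:
            # dataclasses.replace copies the defaults and overrides one field.
            configs.append(replace(defaults, **{name: value}))
    return configs


if __name__ == "__main__":
    defaults = RenderConfig(light_intensity=1.0, joint_openness=0.0, fov_deg=60.0)
    axes = {
        "light_intensity": [0.25, 0.5],
        "fov_deg": [30.0, 90.0, 120.0],
    }
    for cfg in one_axis_sweeps(defaults, axes):
        print(cfg, "focal length (mm):", round(focal_length_mm(cfg.fov_deg), 2))
```

Sweeping one axis at a time keeps any change in model performance attributable to that axis; a full Cartesian grid over all axes would be the alternative when interactions between parameters are of interest.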