Decomposing the Generalization Gap in Imitation Learning for Visual Robotic Manipulation
July 7, 2023
Authors: Annie Xie, Lisa Lee, Ted Xiao, Chelsea Finn
cs.AI
Abstract
What makes generalization hard for imitation learning in visual robotic manipulation? This question is difficult to approach at face value, but the environment, from the perspective of a robot, can often be decomposed into enumerable factors of variation, such as the lighting conditions or the placement of the camera. Empirically, generalization to some of these factors has presented a greater obstacle than others, but existing work sheds little light on precisely how much each factor contributes to the generalization gap. Towards an answer to this question, we study imitation learning policies in simulation and on a real-robot, language-conditioned manipulation task to quantify the difficulty of generalizing to different (sets of) factors. We also design a new simulated benchmark of 19 tasks with 11 factors of variation to facilitate more controlled evaluations of generalization. From our study, we determine an ordering of factors based on generalization difficulty that is consistent across simulation and our real-robot setup.
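
The evaluation protocol implied by the abstract, shifting one factor of variation at a time and measuring the drop in success rate relative to the training distribution, can be summarized in a short sketch. The snippet below is illustrative only: the factor names, the `make_env` environment factory, and the `policy.act` interface are assumptions made for this example, not the paper's actual API.

```python
from statistics import mean

# Illustrative factors of variation, following the abstract's examples
# (lighting, camera placement); the full list here is hypothetical.
FACTORS = ["lighting", "camera_pose", "table_texture", "distractors", "object_color"]

def evaluate(policy, env, episodes=50):
    """Return the mean task success rate of `policy` in `env`."""
    successes = []
    for _ in range(episodes):
        obs = env.reset()
        done = False
        while not done:
            obs, reward, done, info = env.step(policy.act(obs))
        successes.append(float(info.get("success", False)))
    return mean(successes)

def per_factor_generalization_gap(policy, make_env):
    """Shift one factor at a time and measure the drop in success rate.

    `make_env(shifted_factors)` is an assumed environment factory that
    samples unseen values for the named factors while keeping all other
    factors at their training-distribution values.
    """
    in_dist = evaluate(policy, make_env(shifted_factors=[]))
    gaps = {}
    for factor in FACTORS:
        shifted = evaluate(policy, make_env(shifted_factors=[factor]))
        gaps[factor] = in_dist - shifted
    # Sort from largest to smallest gap, i.e. hardest to easiest factor.
    return dict(sorted(gaps.items(), key=lambda kv: -kv[1]))
```

Running this protocol in simulation and on the real robot yields one difficulty ordering per setting; the paper's finding is that these orderings agree across the two.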