
Decomposing the Generalization Gap in Imitation Learning for Visual Robotic Manipulation

July 7, 2023
Authors: Annie Xie, Lisa Lee, Ted Xiao, Chelsea Finn
cs.AI

Abstract

What makes generalization hard for imitation learning in visual robotic manipulation? This question is difficult to approach at face value, but the environment from the perspective of a robot can often be decomposed into enumerable factors of variation, such as the lighting conditions or the placement of the camera. Empirically, generalization to some of these factors has presented a greater obstacle than others, but existing work sheds little light on precisely how much each factor contributes to the generalization gap. Towards an answer to this question, we study imitation learning policies in simulation and on a real robot language-conditioned manipulation task to quantify the difficulty of generalization to different (sets of) factors. We also design a new simulated benchmark of 19 tasks with 11 factors of variation to facilitate more controlled evaluations of generalization. From our study, we determine an ordering of factors based on generalization difficulty that is consistent across simulation and our real robot setup.
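The core measurement in the abstract — a per-factor generalization gap, then a ranking of factors by difficulty — can be sketched as follows. This is an illustrative sketch only: the factor names and success rates are hypothetical placeholders, not results from the paper.

```python
# Sketch: ranking factors of variation by their generalization gap.
# All factor names and success rates below are illustrative assumptions,
# not numbers reported in the paper.

# Hypothetical policy success rates: in-distribution, and under a
# distribution shift along each individual factor of variation.
in_dist_success = 0.90
shifted_success = {
    "camera_position": 0.35,
    "lighting": 0.70,
    "table_texture": 0.55,
    "distractor_objects": 0.80,
}

# Generalization gap for a factor = in-distribution success rate
# minus the success rate when only that factor is shifted.
gaps = {factor: in_dist_success - s for factor, s in shifted_success.items()}

# Order factors from hardest (largest gap) to easiest (smallest gap).
ranking = sorted(gaps, key=gaps.get, reverse=True)
print(ranking)
```

Under these made-up numbers, the camera placement shift produces the largest gap, so it ranks first; comparing such rankings between simulated and real-robot evaluations is how a consistent difficulty ordering would be checked.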