RoboCat: A Self-Improving Foundation Agent for Robotic Manipulation
June 20, 2023
Authors: Konstantinos Bousmalis, Giulia Vezzani, Dushyant Rao, Coline Devin, Alex X. Lee, Maria Bauza, Todor Davchev, Yuxiang Zhou, Agrim Gupta, Akhil Raju, Antoine Laurens, Claudio Fantacci, Valentin Dalibard, Martina Zambelli, Murilo Martins, Rugile Pevceviciute, Michiel Blokzijl, Misha Denil, Nathan Batchelor, Thomas Lampe, Emilio Parisotto, Konrad Żołna, Scott Reed, Sergio Gómez Colmenarejo, Jon Scholz, Abbas Abdolmaleki, Oliver Groth, Jean-Baptiste Regli, Oleg Sushkov, Tom Rothörl, José Enrique Chen, Yusuf Aytar, Dave Barker, Joy Ortiz, Martin Riedmiller, Jost Tobias Springenberg, Raia Hadsell, Francesco Nori, Nicolas Heess
cs.AI
Abstract
The ability to leverage heterogeneous robotic experience from different
robots and tasks to quickly master novel skills and embodiments has the
potential to transform robot learning. Inspired by recent advances in
foundation models for vision and language, we propose a foundation agent for
robotic manipulation. This agent, named RoboCat, is a visual goal-conditioned
decision transformer capable of consuming multi-embodiment action-labelled
visual experience. This data spans a large repertoire of motor control skills
from simulated and real robotic arms with varying sets of observations and
actions. With RoboCat, we demonstrate the ability to generalise to new tasks
and robots, both zero-shot and through adaptation using only 100–1000
examples for the target task. We also show how a trained model itself can be
used to generate data for subsequent training iterations, thus providing a
basic building block for an autonomous improvement loop. We investigate the
agent's capabilities, with large-scale evaluations both in simulation and on
three different real robot embodiments. We find that as we grow and diversify
its training data, RoboCat not only shows signs of cross-task transfer, but
also becomes more efficient at adapting to new tasks.