RoboCat: A Self-Improving Foundation Agent for Robotic Manipulation
June 20, 2023
Authors: Konstantinos Bousmalis, Giulia Vezzani, Dushyant Rao, Coline Devin, Alex X. Lee, Maria Bauza, Todor Davchev, Yuxiang Zhou, Agrim Gupta, Akhil Raju, Antoine Laurens, Claudio Fantacci, Valentin Dalibard, Martina Zambelli, Murilo Martins, Rugile Pevceviciute, Michiel Blokzijl, Misha Denil, Nathan Batchelor, Thomas Lampe, Emilio Parisotto, Konrad Żołna, Scott Reed, Sergio Gómez Colmenarejo, Jon Scholz, Abbas Abdolmaleki, Oliver Groth, Jean-Baptiste Regli, Oleg Sushkov, Tom Rothörl, José Enrique Chen, Yusuf Aytar, Dave Barker, Joy Ortiz, Martin Riedmiller, Jost Tobias Springenberg, Raia Hadsell, Francesco Nori, Nicolas Heess
cs.AI
Abstract
The ability to leverage heterogeneous robotic experience from different
robots and tasks to quickly master novel skills and embodiments has the
potential to transform robot learning. Inspired by recent advances in
foundation models for vision and language, we propose a foundation agent for
robotic manipulation. This agent, named RoboCat, is a visual goal-conditioned
decision transformer capable of consuming multi-embodiment action-labelled
visual experience. This data spans a large repertoire of motor control skills
from simulated and real robotic arms with varying sets of observations and
actions. With RoboCat, we demonstrate the ability to generalise to new tasks
and robots, both zero-shot and through adaptation using only 100 to 1000
examples for the target task. We also show how a trained model itself can be
used to generate data for subsequent training iterations, thus providing a
basic building block for an autonomous improvement loop. We investigate the
agent's capabilities, with large-scale evaluations both in simulation and on
three different real robot embodiments. We find that as we grow and diversify
its training data, RoboCat not only shows signs of cross-task transfer, but
also becomes more efficient at adapting to new tasks.
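
The improvement loop mentioned in the abstract can be pictured with a small toy sketch. This is only one plausible reading of the abstract, not the authors' implementation: the names `Episode`, `train`, `fine_tune`, `collect`, and `improvement_iteration` are hypothetical, the "model" is a plain per-task episode count rather than a transformer, and the assumption that the adapted agent is the one generating new data is an interpretation that links the adaptation claim to the data-generation claim.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Episode:
    task: str
    source: str  # "demo" or "self-generated"


def train(dataset: List[Episode]) -> Dict[str, int]:
    # Placeholder "training": the model is just a per-task episode count.
    model: Dict[str, int] = {}
    for ep in dataset:
        model[ep.task] = model.get(ep.task, 0) + 1
    return model


def fine_tune(model: Dict[str, int], demos: List[Episode]) -> Dict[str, int]:
    # Placeholder adaptation: copy the model and fold in the few target-task demos.
    adapted = dict(model)
    for ep in demos:
        adapted[ep.task] = adapted.get(ep.task, 0) + 1
    return adapted


def collect(model: Dict[str, int], task: str, n: int) -> List[Episode]:
    # Placeholder rollouts: a model that has seen the task emits n new episodes.
    if model.get(task, 0) == 0:
        return []
    return [Episode(task, "self-generated") for _ in range(n)]


def improvement_iteration(
    dataset: List[Episode],
    new_task_demos: Dict[str, List[Episode]],
    rollouts_per_task: int,
) -> Tuple[Dict[str, int], List[Episode]]:
    # Assumed loop: train, adapt to each new task, self-generate data, retrain.
    agent = train(dataset)
    grown = list(dataset)
    for task, demos in new_task_demos.items():
        adapted = fine_tune(agent, demos)
        grown.extend(demos)
        grown.extend(collect(adapted, task, rollouts_per_task))
    return train(grown), grown


base = [Episode("stack", "demo") for _ in range(3)]
demos = {"lift": [Episode("lift", "demo") for _ in range(2)]}
agent, dataset = improvement_iteration(base, demos, rollouts_per_task=4)
print(agent)  # {'stack': 3, 'lift': 6}
```

In this toy run the dataset grows from 3 to 9 episodes, 4 of them self-generated, and the retrained agent covers the new task alongside the original one. The real system would replace each placeholder with actual model training and robot rollouts.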