Octo: An Open-Source Generalist Robot Policy
May 20, 2024
Authors: Octo Model Team, Dibya Ghosh, Homer Walke, Karl Pertsch, Kevin Black, Oier Mees, Sudeep Dasari, Joey Hejna, Tobias Kreiman, Charles Xu, Jianlan Luo, You Liang Tan, Pannag Sanketi, Quan Vuong, Ted Xiao, Dorsa Sadigh, Chelsea Finn, Sergey Levine
cs.AI
Abstract
Large policies pretrained on diverse robot datasets have the potential to
transform robotic learning: instead of training new policies from scratch, such
generalist robot policies may be finetuned with only a little in-domain data,
yet generalize broadly. However, to be widely applicable across a range of
robotic learning scenarios, environments, and tasks, such policies need to
handle diverse sensors and action spaces, accommodate a variety of commonly
used robotic platforms, and finetune readily and efficiently to new domains. In
this work, we aim to lay the groundwork for developing open-source, widely
applicable, generalist policies for robotic manipulation. As a first step, we
introduce Octo, a large transformer-based policy trained on 800k trajectories
from the Open X-Embodiment dataset, the largest robot manipulation dataset to
date. It can be instructed via language commands or goal images and can be
effectively finetuned to robot setups with new sensory inputs and action spaces
within a few hours on standard consumer GPUs. In experiments across 9 robotic
platforms, we demonstrate that Octo serves as a versatile policy initialization
that can be effectively finetuned to new observation and action spaces. We also
perform detailed ablations of design decisions for the Octo model, from
architecture to training data, to guide future research on building generalist
robot models.