Towards Synergistic, Generalized, and Efficient Dual-System for Robotic Manipulation

October 10, 2024
Authors: Qingwen Bu, Hongyang Li, Li Chen, Jisong Cai, Jia Zeng, Heming Cui, Maoqing Yao, Yu Qiao
cs.AI

Abstract

The increasing demand for versatile robotic systems that operate in diverse and dynamic environments has emphasized the importance of a generalist policy, which leverages a large cross-embodiment data corpus to facilitate broad adaptability and high-level reasoning. However, the generalist struggles with inefficient inference and costly training. The specialist policy, in contrast, is tailored to domain-specific data and excels at task-level precision and efficiency. Yet it lacks the generalization capacity needed for a wide range of applications. Inspired by these observations, we introduce RoboDual, a synergistic dual-system that combines the merits of both generalist and specialist policies. A diffusion transformer-based specialist is devised for multi-step action rollouts, conditioned on the high-level task understanding and discretized action output of a vision-language-action (VLA) based generalist. Compared to OpenVLA, RoboDual achieves a 26.7% improvement in real-world settings and a 12% gain on CALVIN by introducing a specialist policy with merely 20M trainable parameters. It maintains strong performance with only 5% of the demonstration data, and enables a 3.8 times higher control frequency in real-world deployment. Code will be made publicly available. Our project page is hosted at: https://opendrivelab.com/RoboDual/
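
The dual-system idea described in the abstract can be illustrated with a toy control loop. The sketch below is not RoboDual's implementation: the function names (`slow_generalist`, `fast_specialist`, `run_dual_system`), the refresh interval, and the arithmetic stand-ins for the two models are all hypothetical, and the specialist here emits a single action per step rather than a multi-step rollout. It only shows how a slow generalist's cached output might condition a fast specialist that runs on every control step, which is one plausible way a higher control frequency could arise.

```python
from typing import List, Tuple

# Hypothetical stand-ins; nothing here comes from the RoboDual code base.

def slow_generalist(observation: float) -> Tuple[float, int]:
    """Pretend VLA generalist: returns a (latent, discretized action bin) pair."""
    latent = observation * 0.5
    action_bin = int(round(observation))  # coarse, discretized action
    return latent, action_bin

def fast_specialist(observation: float, latent: float, action_bin: int) -> float:
    """Pretend specialist: refines the coarse action using the fresh observation."""
    return action_bin + 0.1 * (observation - latent)

def run_dual_system(observations: List[float], refresh_every: int = 4):
    """Call the generalist every `refresh_every` steps, the specialist every step."""
    actions = []
    generalist_calls = 0
    cached = None
    for step, obs in enumerate(observations):
        if step % refresh_every == 0:
            # Expensive call: refresh the cached conditioning signal.
            cached = slow_generalist(obs)
            generalist_calls += 1
        latent, action_bin = cached
        # Cheap call: runs on every control step with the latest observation.
        actions.append(fast_specialist(obs, latent, action_bin))
    return actions, generalist_calls

actions, generalist_calls = run_dual_system([float(i) for i in range(8)], refresh_every=4)
print(len(actions), generalist_calls)  # 8 actions from 2 generalist calls
```

In this toy setup the expensive model is invoked once per four control steps, so the loop's step rate is bounded mainly by the cheap specialist. The actual ratio and conditioning mechanism in RoboDual are not specified in the abstract.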