Sequential Dexterity: Chaining Dexterous Policies for Long-Horizon Manipulation
September 2, 2023
Authors: Yuanpei Chen, Chen Wang, Li Fei-Fei, C. Karen Liu
cs.AI
Abstract
Many real-world manipulation tasks consist of a series of subtasks that are
significantly different from one another. Such long-horizon, complex tasks
highlight the potential of dexterous hands, which possess adaptability and
versatility, capable of seamlessly transitioning between different modes of
functionality without the need for re-grasping or external tools. However,
challenges arise from the high-dimensional action space of dexterous hands and
the complex compositional dynamics of long-horizon tasks. We present Sequential
Dexterity, a general system based on reinforcement learning (RL) that chains
multiple dexterous policies for achieving long-horizon task goals. The core of
the system is a transition feasibility function that progressively finetunes
the sub-policies to improve the chaining success rate, while also enabling
autonomous policy-switching for recovery from failures and bypassing redundant
stages. Despite being trained only in simulation with a few task objects, our
system demonstrates generalization capability to novel object shapes and is
able to zero-shot transfer to a real-world robot equipped with a dexterous
hand. More details and video results can be found at
https://sequential-dexterity.github.io
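
The policy-switching idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the transition feasibility function is defined or queried, so everything below is an assumption made for illustration: the stage names (reach, grasp, place), the dictionary state, the hand-written feasibility scores, the threshold, and the helper select_stage are all hypothetical and are not the authors' implementation. In the real system the feasibility function would presumably be learned rather than hand-coded.

```python
from typing import Callable, Dict, Sequence

# Hypothetical state representation: a dict of boolean flags.
State = Dict[str, bool]
FeasibilityFn = Callable[[State], float]


def select_stage(
    state: State,
    feasibility_fns: Sequence[FeasibilityFn],
    threshold: float = 0.5,
) -> int:
    """Pick the most advanced stage whose feasibility score clears the threshold.

    Scanning from the last stage backwards lets the controller skip stages
    that are already satisfied and fall back to an earlier stage after a
    failure. If no stage clears the threshold, stage 0 is returned.
    """
    for idx in range(len(feasibility_fns) - 1, -1, -1):
        if feasibility_fns[idx](state) >= threshold:
            return idx
    return 0


# Hand-written stand-ins for learned feasibility functions (assumption).
STAGES = ["reach", "grasp", "place"]
FEASIBILITY: Sequence[FeasibilityFn] = [
    lambda s: 1.0,                                # reaching is always possible
    lambda s: 1.0 if s["near_object"] else 0.0,   # grasp needs the hand near the object
    lambda s: 1.0 if s["holding"] else 0.0,       # place needs the object in hand
]

if __name__ == "__main__":
    start = {"near_object": False, "holding": False}
    already_holding = {"near_object": True, "holding": True}
    dropped = {"near_object": False, "holding": False}

    for label, state in [
        ("start", start),
        ("already holding", already_holding),
        ("dropped", dropped),
    ]:
        print(label, "->", STAGES[select_stage(state, FEASIBILITY)])
```

In this toy setup, a state where the object is already held jumps straight to the last stage (bypassing redundant stages), and a state where the object has been dropped returns to the first stage (recovery from failure). The backward scan is only one possible selection rule; the abstract does not say which rule the system uses.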