Sequential Dexterity: Chaining Dexterous Policies for Long-Horizon Manipulation

Abstract

Many real-world manipulation tasks consist of a series of subtasks that differ significantly from one another. Such long-horizon, complex tasks highlight the potential of dexterous hands, whose adaptability and versatility allow them to transition seamlessly between different modes of functionality without re-grasping or external tools. However, challenges arise from the high-dimensional action space of the dexterous hand and the complex compositional dynamics of long-horizon tasks. We present Sequential Dexterity, a general system based on reinforcement learning (RL) that chains multiple dexterous policies to achieve long-horizon task goals. The core of the system is a transition feasibility function that progressively finetunes the sub-policies to improve the chaining success rate, while also enabling autonomous policy switching to recover from failures and bypass redundant stages. Despite being trained only in simulation with a few task objects, our system generalizes to novel object shapes and transfers zero-shot to a real-world robot equipped with a dexterous hand. More details and video results can be found at https://sequential-dexterity.github.io.
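To make the policy-switching idea concrete, here is a minimal Python sketch of how per-stage feasibility scores could drive stage selection. It is not the authors' implementation. The stage names, the dictionary-based state, the hand-written scoring functions, the 0.5 threshold, and the "pick the latest feasible stage" rule are all assumptions made for illustration; in the actual system the feasibility function is learned.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical state representation: a few boolean flags standing in for
# the real robot/object observation.
State = Dict[str, bool]
FeasibilityFn = Callable[[State], float]

# Hypothetical stage names for a four-stage chain (illustrative only).
STAGES: List[str] = ["search", "orient", "grasp", "insert"]

# Hand-written stand-ins for learned feasibility scores in [0, 1].
TOY_FEASIBILITY: List[FeasibilityFn] = [
    lambda s: 1.0,                                              # search: always possible
    lambda s: 1.0 if s["visible"] else 0.0,                     # orient: needs a visible object
    lambda s: 1.0 if s["visible"] and s["oriented"] else 0.0,   # grasp: needs correct orientation
    lambda s: 1.0 if s["grasped"] else 0.0,                     # insert: needs object in hand
]


def select_stage(
    state: State,
    feasibility_fns: Sequence[FeasibilityFn],
    threshold: float = 0.5,
) -> int:
    """Return the index of the latest stage whose score meets the threshold.

    Scanning from the last stage backwards lets the chain skip stages that
    are already satisfied and fall back to an earlier stage after a failure.
    If no stage qualifies, the first stage is returned.
    """
    for idx in range(len(feasibility_fns) - 1, -1, -1):
        if feasibility_fns[idx](state) >= threshold:
            return idx
    return 0


# Object already visible and oriented: earlier stages are redundant.
bypass = select_stage(
    {"visible": True, "oriented": True, "grasped": False}, TOY_FEASIBILITY
)

# Object lost from view (for example after a drop): restart the chain.
recover = select_stage(
    {"visible": False, "oriented": False, "grasped": False}, TOY_FEASIBILITY
)

print(STAGES[bypass], STAGES[recover])
```

Under these assumptions, a single selection rule covers both behaviors named in the abstract: skipping ahead when a later stage is already feasible, and returning to an earlier stage when it is not.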
