Sequential Dexterity: Chaining Dexterous Policies for Long-Horizon Manipulation
September 2, 2023
Authors: Yuanpei Chen, Chen Wang, Li Fei-Fei, C. Karen Liu
cs.AI
Abstract
Many real-world manipulation tasks consist of a series of subtasks that are
significantly different from one another. Such long-horizon, complex tasks
highlight the potential of dexterous hands, whose adaptability and
versatility let them transition seamlessly between different modes of
functionality without re-grasping or external tools. However, challenges
arise from the high-dimensional action space of dexterous hands and the
complex compositional dynamics of long-horizon tasks. We present Sequential
Dexterity, a general system based on reinforcement learning (RL) that chains
multiple dexterous policies to achieve long-horizon task goals. The core of
the system is a transition feasibility function that progressively fine-tunes
the sub-policies to improve the chaining success rate, while also enabling
autonomous policy switching to recover from failures and bypass redundant
stages. Despite being trained only in simulation with a few task objects, our
system demonstrates generalization capability to novel object shapes and is
able to zero-shot transfer to a real-world robot equipped with a dexterous
hand. More details and video results can be found at
https://sequential-dexterity.github.io
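
The abstract does not say how the transition feasibility function drives policy switching, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes each stage has a feasibility function that scores the current state, and that the controller runs the latest stage whose score meets a threshold. The names `select_stage`, `threshold`, the three toy stages, and the dictionary state are all hypothetical.

```python
from typing import Callable, Dict, Sequence

# Hypothetical state representation: a flat dictionary of boolean flags.
State = Dict[str, bool]
# A feasibility function maps a state to a score in [0, 1] (assumed convention).
FeasibilityFn = Callable[[State], float]


def select_stage(
    state: State,
    feasibility_fns: Sequence[FeasibilityFn],
    threshold: float = 0.5,
) -> int:
    """Return the index of the latest stage whose feasibility meets the threshold.

    Scanning from the last stage backward means that stages which are already
    unnecessary get skipped, and that a failure (a later stage becoming
    infeasible) sends the controller back to an earlier stage.
    """
    for idx in range(len(feasibility_fns) - 1, -1, -1):
        if feasibility_fns[idx](state) >= threshold:
            return idx
    return 0  # Nothing looks feasible: start again from the first stage.


# Toy three-stage chain (hypothetical): search -> grasp -> insert.
toy_feasibility = [
    lambda s: 1.0,                              # searching is always possible
    lambda s: 1.0 if s["found"] else 0.0,       # grasping needs a located object
    lambda s: 1.0 if s["grasped"] else 0.0,     # inserting needs a held object
]

if __name__ == "__main__":
    # Object already located: the search stage is bypassed.
    print(select_stage({"found": True, "grasped": False}, toy_feasibility))
    # Object dropped during insertion: the controller falls back to grasping.
    print(select_stage({"found": True, "grasped": False}, toy_feasibility))
```

In this sketch both behaviors named in the abstract, recovering from failures and bypassing redundant stages, come from the same backward scan. In a learned system the hand-written lambdas would be replaced by trained feasibility estimates.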