Towards Affordance-Aware Robotic Dexterous Grasping with Human-like Priors

August 12, 2025
Authors: Haoyu Zhao, Linghao Zhuang, Xingyue Zhao, Cheng Zeng, Haoran Xu, Yuming Jiang, Jun Cen, Kexiang Wang, Jiayan Guo, Siteng Huang, Xin Li, Deli Zhao, Hua Zou
cs.AI

Abstract

A dexterous hand capable of generalizable object grasping is fundamental to the development of general-purpose embodied AI. However, previous methods focus narrowly on low-level grasp-stability metrics, neglecting the affordance-aware positioning and human-like poses that are crucial for downstream manipulation. To address these limitations, we propose AffordDex, a novel two-stage training framework that learns a universal grasping policy with an inherent understanding of both motion priors and object affordances. In the first stage, a trajectory imitator is pre-trained on a large corpus of human hand motions to instill a strong prior for natural movement. In the second stage, a residual module is trained to adapt these general human-like motions to specific object instances. This refinement is guided by two key components: our Negative Affordance-aware Segmentation (NAA) module, which identifies functionally inappropriate contact regions, and a privileged teacher-student distillation process that ensures the final vision-based policy achieves a high success rate. Extensive experiments demonstrate that AffordDex not only achieves universal dexterous grasping but also remains remarkably human-like in posture and functionally appropriate in contact location. As a result, AffordDex significantly outperforms state-of-the-art baselines across seen objects, unseen instances, and even entirely novel categories.
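The abstract describes a compositional policy: a pre-trained human-motion prior plus a learned object-conditioned residual, refined by an affordance-based signal. The PyTorch sketch below is a minimal, hypothetical rendering of that structure; all class names, network sizes, the additive composition of base and residual actions, and the naa_penalty scoring rule are assumptions made for illustration, not the authors' implementation.

```python
# Minimal sketch of the two-stage structure described in the abstract.
# Every name, dimension, and scoring rule below is an illustrative
# assumption, not the paper's released implementation.
import torch
import torch.nn as nn


class TrajectoryImitator(nn.Module):
    """Stage 1: policy pre-trained on human hand-motion data (motion prior)."""

    def __init__(self, obs_dim: int = 64, act_dim: int = 24):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 256), nn.ReLU(),
            nn.Linear(256, act_dim),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)


class ResidualModule(nn.Module):
    """Stage 2: object-conditioned residual that adapts the generic
    human-like motion to a specific object instance."""

    def __init__(self, obs_dim: int = 64, obj_dim: int = 32, act_dim: int = 24):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + obj_dim, 256), nn.ReLU(),
            nn.Linear(256, act_dim),
        )

    def forward(self, obs: torch.Tensor, obj_feat: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([obs, obj_feat], dim=-1))


def grasp_action(imitator, residual, obs, obj_feat):
    """Final action = frozen human-like prior + object-specific correction."""
    with torch.no_grad():  # we assume the stage-1 prior is frozen in stage 2
        base = imitator(obs)
    return base + residual(obs, obj_feat)


def naa_penalty(contact_flags: torch.Tensor) -> torch.Tensor:
    """Toy stand-in for the NAA signal: given per-contact booleans marking
    contacts inside functionally inappropriate regions, return the fraction
    of bad contacts as a penalty term (the exact scoring is our guess)."""
    return contact_flags.float().mean()


# Toy usage with random tensors; all dimensions are placeholders.
imitator = TrajectoryImitator()
residual = ResidualModule()
obs = torch.randn(1, 64)
obj_feat = torch.randn(1, 32)
action = grasp_action(imitator, residual, obs, obj_feat)
print(action.shape)  # torch.Size([1, 24])
```

A residual formulation of this kind keeps the human-like prior intact by default and lets the second stage learn only small, object-specific corrections, which is consistent with the abstract's claim that the final poses remain natural while contacts become affordance-appropriate.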