

Towards Affordance-Aware Robotic Dexterous Grasping with Human-like Priors

August 12, 2025
Authors: Haoyu Zhao, Linghao Zhuang, Xingyue Zhao, Cheng Zeng, Haoran Xu, Yuming Jiang, Jun Cen, Kexiang Wang, Jiayan Guo, Siteng Huang, Xin Li, Deli Zhao, Hua Zou
cs.AI

Abstract

A dexterous hand capable of grasping objects in a generalizable manner is fundamental for the development of general-purpose embodied AI. However, previous methods focus narrowly on low-level grasp stability metrics, neglecting affordance-aware positioning and human-like poses, which are crucial for downstream manipulation. To address these limitations, we propose AffordDex, a novel two-stage training framework that learns a universal grasping policy with an inherent understanding of both motion priors and object affordances. In the first stage, a trajectory imitator is pre-trained on a large corpus of human hand motions to instill a strong prior for natural movement. In the second stage, a residual module is trained to adapt these general human-like motions to specific object instances. This refinement is critically guided by two components: our Negative Affordance-aware Segmentation (NAA) module, which identifies functionally inappropriate contact regions, and a privileged teacher-student distillation process that ensures a high success rate for the final vision-based policy. Extensive experiments demonstrate that AffordDex not only achieves universal dexterous grasping but also remains remarkably human-like in posture and functionally appropriate in contact location. As a result, AffordDex significantly outperforms state-of-the-art baselines across seen objects, unseen instances, and even entirely novel categories.
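To make the two-stage design concrete, the following is a minimal sketch of a base-plus-residual policy composition, assuming a pre-trained human-motion policy (stage one) whose action is corrected by an object-conditioned residual module (stage two). All module names, network sizes, and the additive composition are illustrative assumptions based on the abstract, not the authors' implementation; the affordance features stand in for whatever the NAA module would provide.

```python
import torch
import torch.nn as nn

# Hypothetical sketch: names, dimensions, and the additive residual
# composition are assumptions for illustration, not the AffordDex code.

class BaseHandPolicy(nn.Module):
    """Stage 1 (assumed): policy pre-trained to imitate human hand trajectories."""
    def __init__(self, obs_dim: int, action_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 256), nn.ReLU(),
            nn.Linear(256, action_dim),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)


class ResidualModule(nn.Module):
    """Stage 2 (assumed): predicts an object-specific correction to the base action,
    conditioned on the observation plus affordance features (e.g., NAA output)."""
    def __init__(self, obs_dim: int, afford_dim: int, action_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + afford_dim, 256), nn.ReLU(),
            nn.Linear(256, action_dim),
        )

    def forward(self, obs: torch.Tensor, afford_feat: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([obs, afford_feat], dim=-1))


# Final action = human-like prior (base) + object-specific adaptation (residual).
obs_dim, afford_dim, action_dim = 128, 64, 22   # hypothetical sizes
base = BaseHandPolicy(obs_dim, action_dim)
residual = ResidualModule(obs_dim, afford_dim, action_dim)

obs = torch.randn(1, obs_dim)
afford_feat = torch.randn(1, afford_dim)        # stand-in for NAA-derived features
action = base(obs) + residual(obs, afford_feat)
```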