HERMES: Human-to-Robot Embodied Learning from Multi-Source Motion Data for Mobile Dexterous Manipulation
August 27, 2025
Authors: Zhecheng Yuan, Tianming Wei, Langzhe Gu, Pu Hua, Tianhai Liang, Yuanpei Chen, Huazhe Xu
cs.AI
Abstract
Leveraging human motion data to impart robots with versatile manipulation
skills has emerged as a promising paradigm in robotic manipulation.
Nevertheless, translating multi-source human hand motions into feasible robot
behaviors remains challenging, particularly for robots equipped with
multi-fingered dexterous hands characterized by complex, high-dimensional
action spaces. Moreover, existing approaches often struggle to produce policies
capable of adapting to diverse environmental conditions. In this paper, we
introduce HERMES, a human-to-robot learning framework for mobile bimanual
dexterous manipulation. First, HERMES formulates a unified reinforcement
learning approach capable of seamlessly transforming heterogeneous human hand
motions from multiple sources into physically plausible robotic behaviors.
Subsequently, to mitigate the sim2real gap, we devise an end-to-end, depth
image-based sim2real transfer method for improved generalization to real-world
scenarios. Furthermore, to enable autonomous operation in varied and
unstructured environments, we augment the navigation foundation model with a
closed-loop Perspective-n-Point (PnP) localization mechanism, ensuring precise
alignment of visual goals and effectively bridging autonomous navigation and
dexterous manipulation. Extensive experimental results demonstrate that HERMES
consistently exhibits generalizable behaviors across diverse, in-the-wild
scenarios, successfully performing numerous complex mobile bimanual dexterous
manipulation tasks. Project page: https://gemcollector.github.io/HERMES/
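
The closed-loop localization idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the holonomic base model, the proportional gain, and the tolerances are all assumptions made for illustration, and the PnP solve itself is replaced by a stand-in that computes the goal pose in the robot frame directly. In a real system that estimate would presumably come from matching image features against known 3D points.

```python
import math

def estimate_pose_error(robot_pose, goal_pose):
    """Hypothetical stand-in for a PnP solve.

    Returns the goal pose (x, y, yaw) expressed in the robot's own frame.
    """
    rx, ry, ryaw = robot_pose
    gx, gy, gyaw = goal_pose
    dx, dy = gx - rx, gy - ry
    c, s = math.cos(ryaw), math.sin(ryaw)
    # Rotate the world-frame offset into the robot frame.
    ex = c * dx + s * dy
    ey = -s * dx + c * dy
    # Wrap the yaw difference into (-pi, pi].
    eyaw = math.atan2(math.sin(gyaw - ryaw), math.cos(gyaw - ryaw))
    return ex, ey, eyaw

def apply_motion(robot_pose, cmd):
    """Move an assumed holonomic base by a body-frame displacement."""
    rx, ry, ryaw = robot_pose
    vx, vy, wz = cmd
    c, s = math.cos(ryaw), math.sin(ryaw)
    return (rx + c * vx - s * vy, ry + s * vx + c * vy, ryaw + wz)

def align_to_goal(robot_pose, goal_pose, gain=0.5,
                  pos_tol=0.01, yaw_tol=0.02, max_steps=100):
    """Re-estimate the error each step and correct a fraction of it."""
    for step in range(max_steps):
        ex, ey, eyaw = estimate_pose_error(robot_pose, goal_pose)
        if math.hypot(ex, ey) < pos_tol and abs(eyaw) < yaw_tol:
            return robot_pose, step
        cmd = (gain * ex, gain * ey, gain * eyaw)
        robot_pose = apply_motion(robot_pose, cmd)
    return robot_pose, max_steps

if __name__ == "__main__":
    final_pose, steps = align_to_goal((0.0, 0.0, 0.0), (1.0, 0.5, 0.3))
    print(final_pose, steps)
```

The point of the loop is that the pose error is re-estimated after every motion rather than once up front, so errors in any single estimate or motion are corrected on the next iteration.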