

OmniRetarget: Interaction-Preserving Data Generation for Humanoid Whole-Body Loco-Manipulation and Scene Interaction

September 30, 2025
Authors: Lujie Yang, Xiaoyu Huang, Zhen Wu, Angjoo Kanazawa, Pieter Abbeel, Carmelo Sferrazza, C. Karen Liu, Rocky Duan, Guanya Shi
cs.AI

Abstract

A dominant paradigm for teaching humanoid robots complex skills is to retarget human motions as kinematic references to train reinforcement learning (RL) policies. However, existing retargeting pipelines often struggle with the significant embodiment gap between humans and robots, producing physically implausible artifacts like foot-skating and penetration. More importantly, common retargeting methods neglect the rich human-object and human-environment interactions essential for expressive locomotion and loco-manipulation. To address this, we introduce OmniRetarget, an interaction-preserving data generation engine based on an interaction mesh that explicitly models and preserves the crucial spatial and contact relationships between an agent, the terrain, and manipulated objects. By minimizing the Laplacian deformation between the human and robot meshes while enforcing kinematic constraints, OmniRetarget generates kinematically feasible trajectories. Moreover, preserving task-relevant interactions enables efficient data augmentation, from a single demonstration to different robot embodiments, terrains, and object configurations. We comprehensively evaluate OmniRetarget by retargeting motions from OMOMO, LAFAN1, and our in-house MoCap datasets, generating over 8 hours of trajectories that achieve better kinematic constraint satisfaction and contact preservation than widely used baselines. Such high-quality data enables proprioceptive RL policies to successfully execute long-horizon (up to 30 seconds) parkour and loco-manipulation skills on a Unitree G1 humanoid, trained with only 5 reward terms and simple domain randomization shared by all tasks, without any learning curriculum.
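To make the Laplacian-deformation objective concrete, below is a minimal NumPy/SciPy sketch of the per-frame energy an interaction-mesh retargeting would minimize. It assumes uniform neighbor weights and a Delaunay tetrahedralization of the combined keypoint set (body joints plus sampled object and terrain points); the function names and the `human_pts`/`robot_pts` inputs are illustrative placeholders, not the paper's actual API.

```python
# Minimal sketch of an interaction-mesh Laplacian deformation energy,
# assuming uniform neighbor weights and Delaunay connectivity built on
# the human (source) configuration. Inputs are (N, 3) arrays of
# corresponding keypoints; all names here are hypothetical.
import numpy as np
from scipy.spatial import Delaunay

def build_neighbors(points):
    """Neighbor sets from the Delaunay tetrahedralization (the interaction mesh)."""
    tets = Delaunay(points).simplices  # (num_tets, 4) vertex indices
    nbrs = [set() for _ in range(len(points))]
    for tet in tets:
        for i in tet:
            nbrs[i].update(j for j in tet if j != i)
    return [sorted(n) for n in nbrs]

def laplacian_coords(points, nbrs):
    """delta_i = v_i - mean of v_i's interaction-mesh neighbors."""
    return np.stack([points[i] - points[n].mean(axis=0)
                     for i, n in enumerate(nbrs)])

def deformation_energy(robot_pts, human_pts, nbrs):
    """Per-frame energy: squared mismatch of Laplacian coordinates."""
    d_human = laplacian_coords(human_pts, nbrs)
    d_robot = laplacian_coords(robot_pts, nbrs)
    return float(np.sum((d_robot - d_human) ** 2))
```

In the full pipeline, this per-frame energy would be minimized over the robot's joint configuration (with points produced by forward kinematics) subject to joint limits, non-penetration, and contact constraints, rather than over the raw point positions as shown here.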