SkillBlender: Towards Versatile Humanoid Whole-Body Loco-Manipulation via Skill Blending

June 11, 2025
Authors: Yuxuan Kuang, Haoran Geng, Amine Elhafsi, Tan-Dzung Do, Pieter Abbeel, Jitendra Malik, Marco Pavone, Yue Wang
cs.AI

Abstract

Humanoid robots hold significant potential for accomplishing daily tasks across diverse environments thanks to their flexibility and human-like morphology. Recent work has made significant progress in humanoid whole-body control and loco-manipulation by leveraging optimal control or reinforcement learning. However, these methods require tedious tuning for each task to achieve satisfactory behaviors, limiting their versatility and scalability to diverse tasks in daily scenarios. To that end, we introduce SkillBlender, a novel hierarchical reinforcement learning framework for versatile humanoid loco-manipulation. SkillBlender first pretrains goal-conditioned, task-agnostic primitive skills, and then dynamically blends these skills to accomplish complex loco-manipulation tasks with minimal task-specific reward engineering. We also introduce SkillBench, a parallel, cross-embodiment, and diverse simulated benchmark containing three embodiments, four primitive skills, and eight challenging loco-manipulation tasks, accompanied by a set of scientific evaluation metrics balancing accuracy and feasibility. Extensive simulated experiments show that our method significantly outperforms all baselines, while naturally regularizing behaviors to avoid reward hacking, resulting in more accurate and feasible movements for diverse loco-manipulation tasks in daily scenarios. Our code and benchmark will be open-sourced to the community to facilitate future research. Project page: https://usc-gvl.github.io/SkillBlender-web/.
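
The abstract says that pretrained primitive skills are "dynamically blended" by a higher-level policy but does not state the blending rule. As a rough illustration only, the sketch below assumes the high-level policy emits one subgoal per skill plus a non-negative weight per skill and per joint, and that the final action is the per-joint weighted average of the skills' actions. The function names, the toy skills, and the normalization step are all hypothetical and are not taken from the SkillBlender code.

```python
from typing import Callable, List, Sequence

# Hypothetical type: a primitive skill maps (observation, subgoal)
# to one action value per joint.
Skill = Callable[[Sequence[float], Sequence[float]], List[float]]


def blend_actions(
    skill_actions: Sequence[Sequence[float]],
    weights: Sequence[Sequence[float]],
) -> List[float]:
    """Per-joint weighted average of skill actions (assumed blending rule).

    skill_actions[k][j] is skill k's action for joint j.
    weights[k][j] is the non-negative weight of skill k on joint j.
    Weights are normalized per joint so they sum to 1 across skills.
    """
    num_joints = len(skill_actions[0])
    blended = []
    for j in range(num_joints):
        total = sum(w[j] for w in weights)
        if total <= 0.0:
            raise ValueError(f"weights for joint {j} must sum to a positive value")
        weighted = sum(w[j] * a[j] for w, a in zip(weights, skill_actions))
        blended.append(weighted / total)
    return blended


def high_level_step(
    obs: Sequence[float],
    skills: Sequence[Skill],
    subgoals: Sequence[Sequence[float]],
    weights: Sequence[Sequence[float]],
) -> List[float]:
    """Query every frozen skill with its subgoal, then blend the results."""
    actions = [skill(obs, goal) for skill, goal in zip(skills, subgoals)]
    return blend_actions(actions, weights)


# Toy stand-ins for pretrained skills on a 3-joint robot (illustrative only).
def toy_walk(obs: Sequence[float], goal: Sequence[float]) -> List[float]:
    return [goal[0], goal[0], 0.0]  # drives the two "leg" joints


def toy_reach(obs: Sequence[float], goal: Sequence[float]) -> List[float]:
    return [0.0, 0.0, goal[0]]  # drives the single "arm" joint


if __name__ == "__main__":
    action = high_level_step(
        obs=[0.0],
        skills=[toy_walk, toy_reach],
        subgoals=[[0.4], [0.9]],
        weights=[[1.0, 1.0, 0.0], [0.0, 0.0, 1.0]],  # legs walk, arm reaches
    )
    print(action)
```

In this toy setup the weights route the leg joints entirely to the walking skill and the arm joint entirely to the reaching skill; intermediate weights would mix the two skills on a shared joint.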