TeamHOI: Learning a Unified Policy for Cooperative Human-Object Interactions with Any Team Size

Abstract

Physics-based humanoid control has achieved remarkable progress in enabling realistic and high-performing single-agent behaviors, yet extending these capabilities to cooperative human-object interaction (HOI) remains challenging. We present TeamHOI, a framework that enables a single decentralized policy to handle cooperative HOIs across any number of cooperating agents. Each agent operates using local observations while attending to other teammates through a Transformer-based policy network with teammate tokens, allowing scalable coordination across variable team sizes. To enforce motion realism while addressing the scarcity of cooperative HOI data, we further introduce a masked Adversarial Motion Prior (AMP) strategy that uses single-human reference motions while masking object-interacting body parts during training. The masked regions are then guided through task rewards to produce diverse and physically plausible cooperative behaviors. We evaluate TeamHOI on a challenging cooperative carrying task involving two to eight humanoid agents and varied object geometries. Finally, to promote stable carrying, we design a team-size- and shape-agnostic formation reward. TeamHOI achieves high success rates and demonstrates coherent cooperation across diverse configurations with a single policy.
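As a rough illustration of the masking idea described above, the sketch below zeroes out the features of object-interacting body parts before they would be passed to an AMP-style discriminator. The body-part names, the choice of arms as the masked parts, and the function `mask_amp_observation` are all hypothetical assumptions for illustration. The abstract does not specify the feature layout or the masking mechanism, so this is not the authors' implementation.

```python
# Hypothetical body-part layout (assumption; the actual layout is not given in the abstract).
BODY_PARTS = ["pelvis", "torso", "head", "left_arm", "right_arm", "left_leg", "right_leg"]

# Assumption: in a carrying task, the arms are the object-interacting parts.
MASKED_PARTS = {"left_arm", "right_arm"}


def mask_amp_observation(features):
    """Flatten per-part features into one list, zeroing the masked parts.

    `features` maps each body-part name to a list of floats.
    """
    masked = []
    for part in BODY_PARTS:
        values = features[part]
        if part in MASKED_PARTS:
            # Hide this part from the discriminator by replacing it with zeros.
            masked.extend([0.0] * len(values))
        else:
            masked.extend(values)
    return masked


if __name__ == "__main__":
    example = {part: [1.0, 2.0] for part in BODY_PARTS}
    print(mask_amp_observation(example))
```

In this sketch the same mask would be applied to both the single-human reference motions and the policy's rollouts, so the discriminator only judges the unmasked parts and the masked parts are left to the task rewards.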
