PHUMA: Physically-Grounded Humanoid Locomotion Dataset

Abstract

Motion imitation is a promising approach for humanoid locomotion, enabling agents to acquire humanlike behaviors. Existing methods typically rely on high-quality motion capture datasets such as AMASS, but these are scarce and expensive, which limits scalability and diversity. Recent studies attempt to scale up data collection by converting large-scale internet videos into humanoid motion, as exemplified by Humanoid-X. However, this conversion often introduces physical artifacts such as floating, ground penetration, and foot skating, which hinder stable imitation. In response, we introduce PHUMA, a Physically-grounded HUMAnoid locomotion dataset that leverages human video at scale while addressing physical artifacts through careful data curation and physics-constrained retargeting. PHUMA enforces joint limits, ensures ground contact, and eliminates foot skating, producing motions that are both large-scale and physically reliable. We evaluate PHUMA in two settings: (i) imitation of unseen motions from self-recorded test videos and (ii) path following with pelvis-only guidance. In both cases, PHUMA-trained policies outperform policies trained on Humanoid-X and AMASS, achieving significant gains in imitating diverse motions. The code is available at https://davian-robotics.github.io/PHUMA.
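
The three artifacts named in the abstract (floating, penetration, and foot skating) can be made concrete with a small diagnostic. The sketch below is illustrative only and is not the PHUMA curation or retargeting pipeline: the function name `artifact_report`, the z-up coordinate convention, the flat ground at `ground_z`, and the 2 cm contact tolerance `tol` are all assumptions chosen for the example.

```python
from math import hypot


def artifact_report(left, right, ground_z=0.0, tol=0.02):
    """Count simple physical artifacts in a two-foot trajectory.

    left, right: lists of (x, y, z) foot positions per frame, in metres,
    with z pointing up (assumed convention, not taken from the paper).
    """
    if len(left) != len(right):
        raise ValueError("both feet need the same number of frames")

    def in_contact(p):
        # A foot counts as touching the ground if it is within tol of it.
        return abs(p[2] - ground_z) <= tol

    def penetrates(p):
        # A foot counts as penetrating if it is more than tol below ground.
        return p[2] < ground_z - tol

    def above(p):
        return p[2] > ground_z + tol

    # Floating: frames where both feet are clearly above the ground.
    # Note that a genuine jump would also be counted by this crude check.
    floating = sum(1 for l, r in zip(left, right) if above(l) and above(r))

    # Penetration: frames where at least one foot is below the ground.
    penetration = sum(
        1 for l, r in zip(left, right) if penetrates(l) or penetrates(r)
    )

    # Foot skating: horizontal distance travelled by a foot while it
    # stays in contact across two consecutive frames.
    skate = 0.0
    for foot in (left, right):
        for prev, cur in zip(foot, foot[1:]):
            if in_contact(prev) and in_contact(cur):
                skate += hypot(cur[0] - prev[0], cur[1] - prev[1])

    return {
        "floating_frames": floating,
        "penetration_frames": penetration,
        "skate_distance": skate,
    }


if __name__ == "__main__":
    left = [(0.0, 0.0, 0.0), (0.05, 0.0, 0.0), (0.05, 0.0, 0.2), (0.05, 0.0, -0.1)]
    right = [(0.0, 0.2, 0.0), (0.0, 0.2, 0.0), (0.0, 0.2, 0.3), (0.0, 0.2, 0.0)]
    print(artifact_report(left, right))
```

A motion clip with nonzero counts from a check of this kind would be a candidate for filtering or correction; how PHUMA actually detects and resolves such artifacts is described in the paper and its released code, not here.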
