Demystifying Action Space Design for Robotic Manipulation Policies

Abstract

The specification of the action space plays a pivotal role in imitation-based learning of robotic manipulation policies, because it fundamentally shapes the optimization landscape of policy learning. While recent advances have focused heavily on scaling training data and model capacity, the choice of action space remains guided by ad-hoc heuristics or legacy designs, leaving the design principles behind robotic policies poorly understood. To address this ambiguity, we conducted a large-scale, systematic empirical study, which confirms that the action space has significant and complex effects on robotic policy learning. We dissect the action design space along temporal and spatial axes, enabling a structured analysis of how these choices govern both policy learnability and control stability. Based on 13,000+ real-world rollouts on a bimanual robot and evaluations of 500+ trained models across four scenarios, we examine the trade-offs between absolute and delta representations, and between joint-space and task-space parameterizations. Our results suggest that properly designing the policy to predict delta actions consistently improves performance, while joint-space and task-space representations offer complementary strengths, favoring control stability and generalization, respectively.
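To make the distinction between absolute and delta representations concrete, the following is a minimal sketch and not the paper's implementation. It assumes that a policy outputs a short chunk of actions and that each delta is expressed relative to the previous target, starting from the robot's current state; the study may define deltas differently. The function names `to_delta` and `to_absolute` and the sample values are hypothetical.

```python
from typing import List, Sequence

Vector = List[float]


def to_delta(current: Sequence[float],
             absolute_chunk: Sequence[Sequence[float]]) -> List[Vector]:
    """Convert absolute targets into step-wise deltas.

    Assumption: each delta is taken relative to the previous target,
    and the first delta is taken relative to the current state.
    """
    deltas: List[Vector] = []
    previous = list(current)
    for target in absolute_chunk:
        deltas.append([t - p for t, p in zip(target, previous)])
        previous = list(target)
    return deltas


def to_absolute(current: Sequence[float],
                delta_chunk: Sequence[Sequence[float]]) -> List[Vector]:
    """Invert to_delta by accumulating deltas onto the current state."""
    targets: List[Vector] = []
    previous = list(current)
    for delta in delta_chunk:
        previous = [p + d for p, d in zip(previous, delta)]
        targets.append(previous)
    return targets


if __name__ == "__main__":
    # Hypothetical 2-dimensional state and a 3-step absolute action chunk.
    state = [0.0, 1.0]
    chunk = [[0.5, 1.0], [1.0, 1.5], [1.0, 2.0]]
    deltas = to_delta(state, chunk)
    print(deltas)
    print(to_absolute(state, deltas) == chunk)
```

The same conversion applies whether the vectors hold joint angles (joint space) or end-effector positions (task space). Orientation components in task space would need proper rotation composition rather than element-wise subtraction, which this sketch does not cover.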
