Demystifying Action Space Design for Robotic Manipulation Policies

Abstract

The specification of the action space plays a pivotal role in imitation-based robotic manipulation policy learning, fundamentally shaping the optimization landscape. While recent advances have focused heavily on scaling training data and model capacity, the choice of action space is still guided by ad-hoc heuristics or legacy designs, which leaves the design principles of robotic policies poorly understood. To address this ambiguity, we conduct a large-scale, systematic empirical study and confirm that the action space has a significant and complex impact on robotic policy learning. We dissect the action design space along temporal and spatial axes, which enables a structured analysis of how these choices govern both policy learnability and control stability. Based on 13,000+ real-world rollouts on a bimanual robot and evaluations of 500+ trained models across four scenarios, we examine the trade-offs between absolute and delta representations, and between joint-space and task-space parameterizations. Our results suggest that properly designing the policy to predict delta actions consistently improves performance, while joint-space and task-space representations offer complementary strengths, favoring control stability and generalization, respectively.
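To make the absolute versus delta distinction concrete, the following is a minimal sketch and not the paper's implementation. It assumes one common convention in which each delta is the difference between consecutive targets, with the first delta taken relative to the robot's current state; the abstract does not state which convention the study uses. The function names `to_delta_actions` and `to_absolute_actions` are hypothetical. The sketch applies to plain vectors such as joint angles or Cartesian positions; task-space orientations would need proper rotation composition rather than element-wise subtraction.

```python
from typing import List, Sequence

Vector = List[float]


def to_delta_actions(current: Sequence[float],
                     absolute_chunk: Sequence[Sequence[float]]) -> List[Vector]:
    """Convert absolute targets into step-wise deltas.

    Assumed convention (hypothetical, not taken from the paper): the first
    delta is relative to the current state, and each later delta is relative
    to the previous target.
    """
    deltas: List[Vector] = []
    prev = list(current)
    for target in absolute_chunk:
        deltas.append([t - p for t, p in zip(target, prev)])
        prev = list(target)
    return deltas


def to_absolute_actions(current: Sequence[float],
                        delta_chunk: Sequence[Sequence[float]]) -> List[Vector]:
    """Integrate step-wise deltas back into absolute targets."""
    absolutes: List[Vector] = []
    prev = list(current)
    for delta in delta_chunk:
        prev = [p + d for p, d in zip(prev, delta)]
        absolutes.append(prev)
    return absolutes


if __name__ == "__main__":
    # Two-dimensional toy state, e.g. two joint angles in radians.
    state = [0.0, 1.0]
    targets = [[0.5, 1.0], [1.0, 0.5]]
    deltas = to_delta_actions(state, targets)
    # Round trip: integrating the deltas recovers the absolute targets.
    assert to_absolute_actions(state, deltas) == targets
```

Under this convention a policy trained on delta actions predicts the `deltas` sequence, and the controller integrates it from the measured state before execution. Whether integration starts from the measured state or from the previous command is itself a design choice that this sketch leaves open.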
