Shortcut Learning in Generalist Robot Policies: The Role of Dataset Diversity and Fragmentation

Abstract

Generalist robot policies trained on large-scale datasets such as Open X-Embodiment (OXE) demonstrate strong performance across a wide range of tasks. However, they often struggle to generalize beyond the distribution of their training data. In this paper, we investigate the underlying cause of this limited generalization. We identify shortcut learning, the reliance on task-irrelevant features, as a key impediment to generalization. Through comprehensive theoretical and empirical analysis, we uncover two primary contributors to shortcut learning: (1) limited diversity within individual sub-datasets, and (2) significant distributional disparities across sub-datasets, which lead to dataset fragmentation. These issues arise from the inherent structure of large-scale datasets like OXE, which are typically composed of multiple sub-datasets collected independently across varied environments and embodiments. Our findings provide critical insights into dataset collection strategies that can reduce shortcut learning and enhance the generalization ability of generalist robot policies. Moreover, in scenarios where acquiring new large-scale data is impractical, we demonstrate that carefully selected robotic data augmentation strategies can effectively reduce shortcut learning in existing offline datasets, thereby improving the generalization of generalist robot policies such as pi_0 in both simulation and real-world environments. More information is available at https://lucky-light-sun.github.io/proj/shortcut-learning-in-grps/.
