Adversarial Data Collection: Human-Collaborative Perturbations for Efficient and Robust Robotic Imitation Learning

Abstract

The pursuit of data efficiency, where quality outweighs quantity, has emerged as a cornerstone of robotic manipulation, especially given the high cost of real-world data collection. We propose that maximizing the informational density of individual demonstrations can dramatically reduce reliance on large-scale datasets while improving task performance. To this end, we introduce Adversarial Data Collection (ADC), a Human-in-the-Loop (HiL) framework that redefines robotic data acquisition through real-time, bidirectional human-environment interactions. Unlike conventional pipelines that passively record static demonstrations, ADC adopts a collaborative perturbation paradigm: during a single episode, an adversarial operator dynamically alters object states, environmental conditions, and linguistic commands, while the tele-operator adaptively adjusts actions to overcome these evolving challenges. This process compresses diverse failure-recovery behaviors, compositional task variations, and environmental perturbations into a minimal number of demonstrations. Our experiments show that ADC-trained models achieve superior compositional generalization to unseen task instructions, enhanced robustness to perceptual perturbations, and emergent error-recovery capabilities. Strikingly, models trained on merely 20% of the demonstration volume, collected through ADC, significantly outperform traditional approaches trained on full datasets. These advances bridge the gap between data-centric learning paradigms and practical robotic deployment, demonstrating that strategic data acquisition, not merely post-hoc processing, is critical for scalable, real-world robot learning. Additionally, we are curating a large-scale ADC-Robotics dataset comprising real-world manipulation tasks with adversarial perturbations. This benchmark will be open-sourced to facilitate advances in robotic imitation learning.
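The collaborative perturbation paradigm described above can be illustrated with a toy simulation. The sketch below is not the authors' pipeline: it assumes a hypothetical one-dimensional workspace, a scripted "tele-operator" that moves one cell toward the current target, and a random "adversary" that either shifts the target object or switches the language instruction mid-episode. All names (`Step`, `collect_adc_episode`, `perturb_prob`) are invented for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Step:
    """One recorded timestep of a (toy) demonstration."""
    gripper: int       # gripper position before acting
    target: int        # position of the object named by the instruction
    instruction: str   # language command active at this step
    action: int        # tele-operator action: -1, 0, or +1
    perturbed: bool    # whether the adversary intervened at this step


def collect_adc_episode(targets, seed=0, perturb_prob=0.3, max_steps=50):
    """Record one episode in which an adversary perturbs the scene.

    `targets` maps each instruction string to an object position.
    Hypothetical toy setup; with perturb_prob=0 it degenerates to a
    conventional static demonstration.
    """
    rng = random.Random(seed)
    instructions = sorted(targets)
    positions = dict(targets)          # copy so the caller's dict is untouched
    instruction = rng.choice(instructions)
    gripper = 0
    steps = []
    for _ in range(max_steps):
        perturbed = rng.random() < perturb_prob
        if perturbed:
            if rng.random() < 0.5:
                # Adversary moves the object the operator is reaching for.
                positions[instruction] += rng.choice([-2, 2])
            else:
                # Adversary issues a (possibly different) language command.
                instruction = rng.choice(instructions)
        target = positions[instruction]
        # Tele-operator adapts: step one cell toward the current target.
        action = (target > gripper) - (target < gripper)
        steps.append(Step(gripper, target, instruction, action, perturbed))
        gripper += action
        if gripper == target:
            break                      # task completed
    return steps


if __name__ == "__main__":
    scene = {"grasp the orange": 5, "grasp the peach": -3}
    episode = collect_adc_episode(scene, seed=1)
    n_perturbed = sum(s.perturbed for s in episode)
    print(f"{len(episode)} steps recorded, {n_perturbed} with perturbations")
```

Under these assumptions, a single perturbed episode contains several target and instruction changes along with the corrective actions that follow them, which is the sense in which one demonstration can carry more information than a static recording.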
