EgoZero: Robot Learning from Smart Glasses
May 26, 2025
Authors: Vincent Liu, Ademi Adeniji, Haotian Zhan, Raunaq Bhirangi, Pieter Abbeel, Lerrel Pinto
cs.AI
Abstract
Despite recent progress in general purpose robotics, robot policies still lag
far behind basic human capabilities in the real world. Humans interact
constantly with the physical world, yet this rich data resource remains largely
untapped in robot learning. We propose EgoZero, a minimal system that learns
robust manipulation policies from human demonstrations captured with Project
Aria smart glasses, and zero robot data. EgoZero enables: (1)
extraction of complete, robot-executable actions from in-the-wild, egocentric,
human demonstrations, (2) compression of human visual observations into
morphology-agnostic state representations, and (3) closed-loop policy learning
that generalizes morphologically, spatially, and semantically. We deploy
EgoZero policies on a Franka Panda robot with a gripper and demonstrate
zero-shot transfer with a 70% success rate across 7 manipulation tasks, with
only 20 minutes of data collection per task. Our results suggest that
in-the-wild human data can serve as a scalable foundation for real-world robot
learning, paving the way toward a future of abundant, diverse, and naturalistic
training data for robots. Code and videos are available at
https://egozero-robot.github.io.