EgoZero: Robot Learning from Smart Glasses
May 26, 2025
Authors: Vincent Liu, Ademi Adeniji, Haotian Zhan, Raunaq Bhirangi, Pieter Abbeel, Lerrel Pinto
cs.AI
Abstract
Despite recent progress in general-purpose robotics, robot policies still lag
far behind basic human capabilities in the real world. Humans interact
constantly with the physical world, yet this rich data resource remains largely
untapped in robot learning. We propose EgoZero, a minimal system that learns
robust manipulation policies from human demonstrations captured with Project
Aria smart glasses, and zero robot data. EgoZero enables: (1) extraction of
complete, robot-executable actions from in-the-wild, egocentric human
demonstrations; (2) compression of human visual observations into
morphology-agnostic state representations; and (3) closed-loop policy learning
that generalizes morphologically, spatially, and semantically. We deploy
EgoZero policies on a gripper-equipped Franka Panda robot and demonstrate
zero-shot transfer with a 70% success rate across 7 manipulation tasks, with
only 20 minutes of data collection per task. Our results suggest that
in-the-wild human data can serve as a scalable foundation for real-world robot
learning, paving the way toward a future of abundant, diverse, and naturalistic
training data for robots. Code and videos are available at
https://egozero-robot.github.io.
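
To make the three components above concrete, here is a minimal sketch of the deployment interface the abstract implies: morphology-agnostic states, robot-executable end-effector actions, and a closed observe-predict-act loop. It assumes states are compressed to sets of 3D points and actions are end-effector poses with a gripper command; the names `State`, `Action`, `PointPolicy`, and `rollout` are hypothetical illustrations, not the authors' API.

```python
# Illustrative sketch only: assumed interfaces, not the EgoZero implementation.
from dataclasses import dataclass
import numpy as np


@dataclass
class State:
    """Morphology-agnostic observation: tracked 3D points in the world frame."""
    points: np.ndarray  # shape (N, 3)


@dataclass
class Action:
    """Robot-executable command: end-effector pose plus gripper state."""
    ee_position: np.ndarray   # shape (3,), translation in the world frame
    ee_rotation: np.ndarray   # shape (3, 3), rotation matrix
    gripper_closed: bool


class PointPolicy:
    """Stand-in for a policy trained on human demonstrations."""

    def predict(self, state: State) -> Action:
        # A trained policy would map points -> action; this stub just
        # hovers 10 cm above the centroid of the observed points.
        target = state.points.mean(axis=0) + np.array([0.0, 0.0, 0.1])
        return Action(ee_position=target,
                      ee_rotation=np.eye(3),
                      gripper_closed=False)


def rollout(policy: PointPolicy, get_state, send_action, steps: int = 100):
    """Closed-loop execution: observe, predict, act, repeat."""
    for _ in range(steps):
        action = policy.predict(get_state())
        send_action(action)


if __name__ == "__main__":
    # Dry run with stubbed observation and actuation callbacks.
    dummy_points = np.random.rand(8, 3)
    rollout(PointPolicy(),
            get_state=lambda: State(points=dummy_points),
            send_action=lambda a: None,
            steps=3)
```

Because the state is a point set rather than raw pixels, the same policy interface can in principle be driven by either a human hand or a robot gripper, which is what allows training on glasses-captured demonstrations and zero-shot transfer to the robot.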