X-Sim: Cross-Embodiment Learning via Real-to-Sim-to-Real

May 11, 2025
Authors: Prithwish Dan, Kushal Kedia, Angela Chao, Edward Weiyi Duan, Maximus Adrian Pace, Wei-Chiu Ma, Sanjiban Choudhury
cs.AI

Abstract

Human videos offer a scalable way to train robot manipulation policies, but lack the action labels needed by standard imitation learning algorithms. Existing cross-embodiment approaches try to map human motion to robot actions, but often fail when the embodiments differ significantly. We propose X-Sim, a real-to-sim-to-real framework that uses object motion as a dense and transferable signal for learning robot policies. X-Sim starts by reconstructing a photorealistic simulation from an RGBD human video and tracking object trajectories to define object-centric rewards. These rewards are used to train a reinforcement learning (RL) policy in simulation. The learned policy is then distilled into an image-conditioned diffusion policy using synthetic rollouts rendered with varied viewpoints and lighting. To transfer to the real world, X-Sim introduces an online domain adaptation technique that aligns real and simulated observations during deployment. Importantly, X-Sim does not require any robot teleoperation data. We evaluate it across 5 manipulation tasks in 2 environments and show that it: (1) improves task progress by 30% on average over hand-tracking and sim-to-real baselines, (2) matches behavior cloning with 10x less data collection time, and (3) generalizes to new camera viewpoints and test-time changes. Code and videos are available at https://portal-cornell.github.io/X-Sim/.
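
To make the idea of an object-centric reward concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the reward's form, so the exponential distance shaping, the `scale` parameter, and the function names `object_tracking_reward` and `episode_return` are all assumptions made for illustration. The sketch only shows how an object trajectory tracked from a human video could serve as a dense, embodiment-independent training signal.

```python
import math

def object_tracking_reward(obj_pos, target_pos, scale=5.0):
    """Hypothetical dense reward in (0, 1].

    Returns 1.0 when the simulated object sits exactly on the position
    tracked from the human video and decays as the distance grows.
    The exponential shaping and `scale` are illustrative assumptions.
    """
    dist = math.dist(obj_pos, target_pos)  # Euclidean distance
    return math.exp(-scale * dist)

def episode_return(sim_positions, video_positions, scale=5.0):
    """Sum per-step rewards, pairing step t of the simulated rollout
    with step t of the object trajectory tracked from the video."""
    return sum(
        object_tracking_reward(p, q, scale)
        for p, q in zip(sim_positions, video_positions)
    )

# Object positions (x, y, z) tracked from a human video (made-up values).
video_traj = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.2, 0.0, 0.1)]
# A simulated rollout that lags behind the demonstrated object motion.
lagging_rollout = [(0.0, 0.0, 0.0), (0.05, 0.0, 0.0), (0.1, 0.0, 0.0)]

perfect = episode_return(video_traj, video_traj)
lagging = episode_return(lagging_rollout, video_traj)
```

Because the reward depends only on where the object is, not on how a hand or gripper moved it, the same signal applies regardless of embodiment; a rollout that reproduces the demonstrated object motion scores higher than one that lags behind it.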