World Model for Robot Learning: A Comprehensive Survey

April 30, 2026
Authors: Bohan Hou, Gen Li, Jindou Jia, Tuo An, Xinying Guo, Sicong Leng, Haoran Geng, Yanjie Ze, Tatsuya Harada, Philip Torr, Oier Mees, Marc Pollefeys, Zhuang Liu, Jiajun Wu, Pieter Abbeel, Jitendra Malik, Yilun Du, Jianfei Yang
cs.AI

Abstract

World models, which are predictive representations of how environments evolve under actions, have become a central component of robot learning. They support policy learning, planning, simulation, evaluation, and data generation, and have advanced rapidly with the rise of foundation models and large-scale video generation. However, the literature remains fragmented across architectures, functional roles, and embodied application domains. To address this gap, we present a comprehensive review of world models from a robot-learning perspective. We examine how world models are coupled with robot policies, how they serve as learned simulators for reinforcement learning and evaluation, and how robotic video world models have progressed from imagination-based generation to controllable, structured, and foundation-scale formulations. We further connect these ideas to navigation and autonomous driving, and summarize representative datasets, benchmarks, and evaluation protocols. Overall, this survey systematically reviews the rapidly growing literature on world models for robot learning, clarifies key paradigms and applications, and highlights major challenges and future directions for predictive modeling in embodied agents. To facilitate continued access to newly emerging works, benchmarks, and resources, we will maintain and regularly update the GitHub repository that accompanies this survey.