World Model for Robot Learning: A Comprehensive Survey
April 30, 2026
Authors: Bohan Hou, Gen Li, Jindou Jia, Tuo An, Xinying Guo, Sicong Leng, Haoran Geng, Yanjie Ze, Tatsuya Harada, Philip Torr, Oier Mees, Marc Pollefeys, Zhuang Liu, Jiajun Wu, Pieter Abbeel, Jitendra Malik, Yilun Du, Jianfei Yang
cs.AI
Abstract
World models, which are predictive representations of how environments evolve under actions, have become a central component of robot learning. They support policy learning, planning, simulation, evaluation, and data generation, and have advanced rapidly with the rise of foundation models and large-scale video generation. However, the literature remains fragmented across architectures, functional roles, and embodied application domains. To address this gap, we present a comprehensive review of world models from a robot-learning perspective. We examine how world models are coupled with robot policies, how they serve as learned simulators for reinforcement learning and evaluation, and how robotic video world models have progressed from imagination-based generation to controllable, structured, and foundation-scale formulations. We further connect these ideas to navigation and autonomous driving, and summarize representative datasets, benchmarks, and evaluation protocols. Overall, this survey systematically reviews the rapidly growing literature on world models for robot learning, clarifies key paradigms and applications, and highlights major challenges and future directions for predictive modeling in embodied agents. To keep newly emerging works, benchmarks, and resources accessible, we will maintain and regularly update the GitHub repository that accompanies this survey.