SIMA 2: A Generalist Embodied Agent for Virtual Worlds

December 4, 2025
Authors: SIMA team, Adrian Bolton, Alexander Lerchner, Alexandra Cordell, Alexandre Moufarek, Andrew Bolt, Andrew Lampinen, Anna Mitenkova, Arne Olav Hallingstad, Bojan Vujatovic, Bonnie Li, Cong Lu, Daan Wierstra, Daniel P. Sawyer, Daniel Slater, David Reichert, Davide Vercelli, Demis Hassabis, Drew A. Hudson, Duncan Williams, Ed Hirst, Fabio Pardo, Felix Hill, Frederic Besse, Hannah Openshaw, Harris Chan, Hubert Soyer, Jane X. Wang, Jeff Clune, John Agapiou, John Reid, Joseph Marino, Junkyung Kim, Karol Gregor, Kaustubh Sridhar, Kay McKinney, Laura Kampis, Lei M. Zhang, Loic Matthey, Luyu Wang, Maria Abi Raad, Maria Loks-Thompson, Martin Engelcke, Matija Kecman, Matthew Jackson, Maxime Gazeau, Ollie Purkiss, Oscar Knagg, Peter Stys, Piermaria Mendolicchio, Raia Hadsell, Rosemary Ke, Ryan Faulkner, Sarah Chakera, Satinder Singh Baveja, Shane Legg, Sheleem Kashem, Tayfun Terzi, Thomas Keck, Tim Harley, Tim Scholtes, Tyson Roberts, Volodymyr Mnih, Yulan Liu, Zhengdong Wang, Zoubin Ghahramani
cs.AI

Abstract

We introduce SIMA 2, a generalist embodied agent that understands and acts in a wide variety of 3D virtual worlds. Built upon a Gemini foundation model, SIMA 2 represents a significant step toward active, goal-directed interaction within an embodied environment. Unlike prior work (e.g., SIMA 1) limited to simple language commands, SIMA 2 acts as an interactive partner, capable of reasoning about high-level goals, conversing with the user, and handling complex instructions given through language and images. Across a diverse portfolio of games, SIMA 2 substantially closes the gap with human performance and demonstrates robust generalization to previously unseen environments, all while retaining the base model's core reasoning capabilities. Furthermore, we demonstrate a capacity for open-ended self-improvement: by leveraging Gemini to generate tasks and provide rewards, SIMA 2 can autonomously learn new skills from scratch in a new environment. This work validates a path toward creating versatile and continuously learning agents for both virtual and, eventually, physical worlds.