SimWorld: An Open-ended Realistic Simulator for Autonomous Agents in Physical and Social Worlds
November 30, 2025
Authors: Jiawei Ren, Yan Zhuang, Xiaokang Ye, Lingjun Mao, Xuhong He, Jianzhi Shen, Mrinaal Dogra, Yiming Liang, Ruixuan Zhang, Tianai Yue, Yiqing Yang, Eric Liu, Ryan Wu, Kevin Benavente, Rajiv Mandya Nagaraju, Muhammad Faayez, Xiyan Zhang, Dhruv Vivek Sharma, Xianrui Zhong, Ziqiao Ma, Tianmin Shu, Zhiting Hu, Lianhui Qin
cs.AI
Abstract
While LLM/VLM-powered AI agents have advanced rapidly in math, coding, and computer use, their applications in complex physical and social environments remain challenging. Building agents that can survive and thrive in the real world (for example, by autonomously earning income or running a business) requires massive-scale interaction, reasoning, training, and evaluation across diverse embodied scenarios. However, existing world simulators for such development fall short: they often rely on limited hand-crafted environments, simulate simplified game-like physics and social rules, and lack native support for LLM/VLM agents. We introduce SimWorld, a new simulator built on Unreal Engine 5, designed for developing and evaluating LLM/VLM agents in rich, real-world-like settings. SimWorld offers three core capabilities: (1) realistic, open-ended world simulation, including accurate physical and social dynamics and language-driven procedural environment generation; (2) a rich interface for LLM/VLM agents, with multimodal world inputs and open-vocabulary actions at varying levels of abstraction; and (3) diverse and extensible physical and social reasoning scenarios that are easily customizable by users. We demonstrate SimWorld by deploying frontier LLM agents (e.g., GPT-4o, Gemini-2.5-Flash, Claude-3.5, and DeepSeek-Prover-V2) on long-horizon multi-agent delivery tasks involving strategic cooperation and competition. The results reveal distinct reasoning patterns and limitations across models. We open-source SimWorld and hope it becomes a foundational platform for advancing real-world agent intelligence across disciplines: https://simworld.org.