OpenGame: Open Agentic Coding for Games
April 20, 2026
Authors: Yilei Jiang, Jinyuan Hu, Qianyin Xiao, Yaozhi Zheng, Ruize Ma, Kaituo Feng, Jiaming Han, Tianshuo Peng, Kaixuan Fan, Manyuan Zhang, Xiangyu Yue
cs.AI
Abstract
Game development sits at the intersection of creative design and intricate software engineering, demanding the joint orchestration of game engines, real-time loops, and tightly coupled state across many files. While Large Language Models (LLMs) and code agents now solve isolated programming tasks with ease, they consistently stumble when asked to produce a fully playable game from a high-level design, collapsing under cross-file inconsistencies, broken scene wiring, and logical incoherence. We bridge this gap with OpenGame, the first open-source agentic framework explicitly designed for end-to-end web game creation. At its core lies Game Skill, a reusable, evolving capability composed of a Template Skill that grows a library of project skeletons from experience and a Debug Skill that maintains a living protocol of verified fixes. Together, these enable the agent to scaffold stable architectures and systematically repair integration errors rather than patch isolated syntax bugs. Powering the framework is GameCoder-27B, a code LLM specialized for game-engine mastery through a three-stage pipeline of continual pre-training, supervised fine-tuning, and execution-grounded reinforcement learning. Since verifying interactive playability is fundamentally harder than checking static code, we further introduce OpenGame-Bench, an evaluation pipeline that scores agentic game generation along Build Health, Visual Usability, and Intent Alignment via headless browser execution and VLM judging. Across 150 diverse game prompts, OpenGame establishes a new state of the art. We hope OpenGame pushes code agents beyond discrete software engineering problems and toward building complex, interactive real-world applications. Our framework will be fully open-sourced.