CreativeGame: Toward Mechanic-Aware Creative Game Generation
April 21, 2026
Authors: Hongnan Ma, Han Wang, Shenglin Wang, Tieyue Yin, Yiwei Shi, Yucong Huang, Yingtian Zou, Muning Wen, Mengyue Yang
cs.AI
Abstract
Large language models can generate plausible game code, but turning this capability into iterative creative improvement remains difficult. In practice, single-shot generation often produces brittle runtime behavior, weak accumulation of experience across versions, and creativity scores that are too subjective to serve as reliable optimization signals. A further limitation is that mechanics are frequently treated only as post-hoc descriptions, rather than as explicit objects that can be planned, tracked, preserved, and evaluated during generation.
This report presents CreativeGame, a multi-agent system for iterative HTML5 game generation that addresses these issues through four coupled ideas: a proxy reward centered on programmatic signals rather than pure LLM judgment; lineage-scoped memory for cross-version experience accumulation; runtime validation integrated into both repair and reward; and a mechanic-guided planning loop in which retrieved mechanic knowledge is converted into an explicit mechanic plan before code generation begins. The goal is not merely to produce a playable artifact in one step, but to support interpretable version-to-version evolution.
The current system contains 71 stored lineages, 88 saved nodes, and a 774-entry global mechanic archive, implemented in 6,181 lines of Python together with inspection and visualization tooling. The system is therefore substantial enough to support architectural analysis, reward inspection, and real lineage-level case studies rather than only prompt-level demos.
A real 4-generation lineage shows that mechanic-level innovation can emerge in later versions and can be inspected directly through version-to-version records. The central contribution is therefore not only game generation, but a concrete pipeline for observing progressive evolution through explicit mechanic change.
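To make the idea of inspecting mechanic change through version-to-version records more concrete, the following is a minimal sketch, not the authors' implementation. The `Node` schema, the mechanic names, and the `proxy_reward` formula are all hypothetical and chosen only for illustration; the report's actual reward and record format are not specified in this abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """One saved game version in a lineage (hypothetical schema)."""
    version: int
    mechanics: frozenset[str]  # explicit mechanics tracked for this version
    runtime_ok: bool           # outcome of a runtime validation pass


def mechanic_diff(parent: Node, child: Node) -> dict[str, list[str]]:
    """Return the mechanics added, removed, and kept between two versions."""
    return {
        "added": sorted(child.mechanics - parent.mechanics),
        "removed": sorted(parent.mechanics - child.mechanics),
        "kept": sorted(parent.mechanics & child.mechanics),
    }


def proxy_reward(parent: Node, child: Node) -> float:
    """Toy programmatic signal (an assumption, not the paper's reward):
    zero if the child fails runtime validation, otherwise the fraction
    of the child's mechanics that are new relative to the parent."""
    if not child.runtime_ok or not child.mechanics:
        return 0.0
    added = mechanic_diff(parent, child)["added"]
    return len(added) / len(child.mechanics)


# A made-up three-version lineage; the mechanic names are illustrative only.
lineage = [
    Node(1, frozenset({"jump", "collect"}), True),
    Node(2, frozenset({"jump", "collect", "dash"}), True),
    Node(3, frozenset({"jump", "dash", "gravity_flip"}), False),
]

for parent, child in zip(lineage, lineage[1:]):
    print(child.version, mechanic_diff(parent, child), proxy_reward(parent, child))
```

Treating mechanics as a set per version keeps the diff purely programmatic, so a record of what changed between versions can be produced without relying on LLM judgment. Gating the toy reward on `runtime_ok` mirrors, in simplified form, the idea of feeding runtime validation into the reward.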