Darwin Gödel Machine: Open-Ended Evolution of Self-Improving Agents

Abstract

Today's AI systems have human-designed, fixed architectures and cannot autonomously and continuously improve themselves. The advance of AI could itself be automated. If done safely, that would accelerate AI development and allow us to reap its benefits much sooner. Meta-learning can automate the discovery of novel algorithms, but is limited by first-order improvements and the human design of a suitable search space. The Gödel machine proposed a theoretical alternative: a self-improving AI that repeatedly modifies itself in a provably beneficial manner. Unfortunately, proving that most changes are net beneficial is impossible in practice. We introduce the Darwin Gödel Machine (DGM), a self-improving system that iteratively modifies its own code (thereby also improving its ability to modify its own codebase) and empirically validates each change using coding benchmarks. Inspired by Darwinian evolution and open-endedness research, the DGM maintains an archive of generated coding agents. It grows the archive by sampling an agent from it and using a foundation model to create a new, interesting version of the sampled agent. This open-ended exploration forms a growing tree of diverse, high-quality agents and allows the parallel exploration of many different paths through the search space. Empirically, the DGM automatically improves its coding capabilities (e.g., better code editing tools, long-context window management, peer-review mechanisms), increasing performance on SWE-bench from 20.0% to 50.0%, and on Polyglot from 14.2% to 30.7%. Furthermore, the DGM significantly outperforms baselines without self-improvement or open-ended exploration. All experiments were done with safety precautions (e.g., sandboxing, human oversight). The DGM is a significant step toward self-improving AI, capable of gathering its own stepping stones along paths that unfold into endless innovation.
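The archive loop described in the abstract (sample an agent, produce a modified version, validate it empirically, add it to the archive) can be illustrated with a minimal toy sketch. This is not the authors' implementation: the names `Agent`, `grow_archive`, `toy_modify`, and `toy_evaluate` are hypothetical, the foundation-model step is replaced by a random string edit, the coding benchmark is replaced by a trivial scoring function, and both uniform parent sampling and keeping every child are assumptions made here for simplicity.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Agent:
    """Toy agent: `code` stands in for the agent's own codebase."""
    code: str
    score: float
    parent: Optional[int] = None  # archive index of the parent (tree edge)


def grow_archive(
    initial_code: str,
    modify: Callable[[str, random.Random], str],
    evaluate: Callable[[str], float],
    iterations: int,
    seed: int = 0,
) -> List[Agent]:
    """Grow an archive of agents by repeated sample, modify, evaluate steps."""
    rng = random.Random(seed)
    archive = [Agent(initial_code, evaluate(initial_code))]
    for _ in range(iterations):
        # Assumption: parents are sampled uniformly from the archive.
        parent_idx = rng.randrange(len(archive))
        # Stand-in for the foundation model rewriting the agent's code.
        child_code = modify(archive[parent_idx].code, rng)
        # Stand-in for empirical validation on a coding benchmark.
        child_score = evaluate(child_code)
        # Assumption: every child is kept, so the archive forms a tree.
        archive.append(Agent(child_code, child_score, parent=parent_idx))
    return archive


def toy_modify(code: str, rng: random.Random) -> str:
    """Hypothetical modification: append one random character."""
    return code + rng.choice("ab")


def toy_evaluate(code: str) -> float:
    """Hypothetical benchmark: fraction-like score based on 'a' count."""
    return code.count("a") / 10.0


archive = grow_archive("", toy_modify, toy_evaluate, iterations=20)
best = max(archive, key=lambda agent: agent.score)
```

Because every child records its parent's index, the archive can be read as a tree, and any lower-scoring agent remains available as a starting point for later branches rather than being discarded.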
