Darwin Gödel Machine: Open-Ended Evolution of Self-Improving Agents
May 29, 2025
Authors: Jenny Zhang, Shengran Hu, Cong Lu, Robert Lange, Jeff Clune
cs.AI
Abstract
Today's AI systems have human-designed, fixed architectures and cannot
autonomously and continuously improve themselves. The advance of AI could
itself be automated. If done safely, that would accelerate AI development and
allow us to reap its benefits much sooner. Meta-learning can automate the
discovery of novel algorithms, but is limited by first-order improvements and
the human design of a suitable search space. The Gödel machine proposed a
theoretical alternative: a self-improving AI that repeatedly modifies itself in
a provably beneficial manner. Unfortunately, proving that most changes are net
beneficial is impossible in practice. We introduce the Darwin Gödel Machine
(DGM), a self-improving system that iteratively modifies its own code (thereby
also improving its ability to modify its own codebase) and empirically
validates each change using coding benchmarks. Inspired by Darwinian evolution
and open-endedness research, the DGM maintains an archive of generated coding
agents. It grows the archive by sampling an agent from it and using a
foundation model to create a new, interesting version of the sampled agent.
This open-ended exploration forms a growing tree of diverse, high-quality
agents and allows the parallel exploration of many different paths through the
search space. Empirically, the DGM automatically improves its coding
capabilities (e.g., better code editing tools, long-context window management,
peer-review mechanisms), increasing performance on SWE-bench from 20.0% to
50.0%, and on Polyglot from 14.2% to 30.7%. Furthermore, the DGM significantly
outperforms baselines without self-improvement or open-ended exploration. All
experiments were done with safety precautions (e.g., sandboxing, human
oversight). The DGM is a significant step toward self-improving AI, capable of
gathering its own stepping stones along paths that unfold into endless
innovation.
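The outer loop the abstract describes (maintain an archive of agents, sample a parent, have a foundation model propose a self-modification, empirically validate the child on a benchmark, and add it back to the archive) can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the function names `sample_parent`, `self_modify`, and `evaluate`, and the toy scoring logic, are all assumptions standing in for the foundation-model calls and SWE-bench/Polyglot evaluations described in the paper.

```python
# Hypothetical sketch of the Darwin Gödel Machine's outer loop as described
# in the abstract. All helper names and scoring logic are illustrative
# assumptions, not the authors' actual API.
import random

def sample_parent(archive):
    """Pick a parent from the archive, biased toward higher-scoring agents."""
    # Small floor keeps every archived agent reachable, preserving diversity.
    weights = [agent["score"] + 0.01 for agent in archive]
    return random.choices(archive, weights=weights, k=1)[0]

def self_modify(parent):
    """Stand-in for the foundation model editing the parent's own codebase."""
    child_code = parent["code"] + "+patch"  # placeholder for an FM-generated change
    return {"code": child_code, "score": 0.0}

def evaluate(agent):
    """Stand-in for empirically scoring an agent on a coding benchmark."""
    # Toy monotone score; real runs use benchmark pass rates (e.g., SWE-bench).
    return min(1.0, 0.2 + 0.01 * len(agent["code"]))

def dgm_loop(initial_agent, iterations):
    # The archive keeps every validated agent, not just the current best,
    # enabling parallel exploration of many branches of the search space.
    archive = [dict(initial_agent, score=evaluate(initial_agent))]
    for _ in range(iterations):
        parent = sample_parent(archive)
        child = self_modify(parent)
        child["score"] = evaluate(child)
        archive.append(child)
    return archive

archive = dgm_loop({"code": "agent-v0"}, iterations=5)
print(len(archive))  # 6: the seed agent plus five generated variants
```

Keeping the full archive rather than only the current best agent is the key open-endedness ingredient: a low-scoring agent may still be the stepping stone from which a later, much better agent branches.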