Darwin Gödel Machine: Open-Ended Evolution of Self-Improving Agents

Abstract

Today's AI systems have fixed, human-designed architectures and cannot autonomously and continuously improve themselves. The advance of AI could itself be automated. If done safely, that would accelerate AI development and allow us to reap its benefits much sooner. Meta-learning can automate the discovery of novel algorithms, but it is limited to first-order improvements and depends on a human-designed search space. The Gödel machine proposed a theoretical alternative: a self-improving AI that repeatedly modifies itself in a provably beneficial manner. Unfortunately, proving that most changes are net beneficial is impossible in practice. We introduce the Darwin Gödel Machine (DGM), a self-improving system that iteratively modifies its own code (thereby also improving its ability to modify its own codebase) and empirically validates each change using coding benchmarks. Inspired by Darwinian evolution and open-endedness research, the DGM maintains an archive of generated coding agents. It grows the archive by sampling an agent from it and using a foundation model to create a new, interesting version of the sampled agent. This open-ended exploration forms a growing tree of diverse, high-quality agents and allows many different paths through the search space to be explored in parallel. Empirically, the DGM automatically improves its coding capabilities (e.g., better code editing tools, long-context window management, peer-review mechanisms), increasing performance on SWE-bench from 20.0% to 50.0% and on Polyglot from 14.2% to 30.7%. Furthermore, the DGM significantly outperforms baselines without self-improvement or open-ended exploration. All experiments were conducted with safety precautions (e.g., sandboxing, human oversight). The DGM is a significant step toward self-improving AI, capable of gathering its own stepping stones along paths that unfold into endless innovation.
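The archive-growing loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Agent` structure, the uniform parent sampling, the choice to keep every evaluated child, and the toy `toy_self_modify` and `toy_evaluate` functions are all assumptions made for illustration. In the real system, the modification step would be a foundation model rewriting the agent's code and the evaluation step would be a coding benchmark run inside a sandbox.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Agent:
    """One node in the archive tree (hypothetical structure)."""
    code: str               # stand-in for the agent's own source code
    score: float            # empirical benchmark score in [0, 1]
    parent: Optional[int]   # archive index of the parent, None for the root


def run_archive_loop(
    initial_code: str,
    self_modify: Callable[[str], str],
    evaluate: Callable[[str], float],
    iterations: int,
    rng: random.Random,
) -> List[Agent]:
    """Grow an archive by sampling a parent, modifying it, and scoring the child."""
    archive = [Agent(initial_code, evaluate(initial_code), None)]
    for _ in range(iterations):
        # Assumption: parents are sampled uniformly; the abstract does not
        # state the sampling rule.
        parent_idx = rng.randrange(len(archive))
        child_code = self_modify(archive[parent_idx].code)
        # Each change is validated empirically rather than proven beneficial.
        child_score = evaluate(child_code)
        # Assumption: every child is kept, so weaker agents can still serve
        # as stepping stones for later branches.
        archive.append(Agent(child_code, child_score, parent_idx))
    return archive


def toy_self_modify(code: str) -> str:
    """Toy stand-in for a foundation model proposing a new agent version."""
    return code + "+"


def toy_evaluate(code: str) -> float:
    """Toy stand-in for a coding benchmark: more '+' means a higher score."""
    return min(1.0, code.count("+") / 10)


if __name__ == "__main__":
    agents = run_archive_loop(
        "agent", toy_self_modify, toy_evaluate, iterations=20, rng=random.Random(0)
    )
    best = max(agents, key=lambda a: a.score)
    print(f"archive size: {len(agents)}, best score: {best.score:.1f}")
```

Because each child records the index of its parent, the archive forms a tree, and any branch can be extended later regardless of its current score. That is the property that separates this kind of open-ended search from plain hill climbing on the single best agent.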
