Hyperagents
March 19, 2026
Authors: Jenny Zhang, Bingchen Zhao, Wannan Yang, Jakob Foerster, Jeff Clune, Minqi Jiang, Sam Devlin, Tatiana Shavrina
cs.AI
Abstract
Self-improving AI systems aim to reduce reliance on human engineering by learning to improve their own learning and problem-solving processes. Existing approaches to self-improvement rely on fixed, handcrafted meta-level mechanisms, fundamentally limiting how fast such systems can improve. The Darwin Gödel Machine (DGM) demonstrates open-ended self-improvement in coding by repeatedly generating and evaluating self-modified variants. Because both evaluation and self-modification are coding tasks, gains in coding ability can translate directly into gains in self-improvement ability. However, this alignment does not generally hold beyond coding domains. We introduce hyperagents: self-referential agents that integrate a task agent (which solves the target task) and a meta agent (which modifies both itself and the task agent) into a single editable program. Crucially, the meta-level modification procedure is itself editable, enabling metacognitive self-modification that improves not only task-solving behavior but also the mechanism that generates future improvements. We instantiate this framework by extending DGM to create DGM-Hyperagents (DGM-H), eliminating the assumption of domain-specific alignment between task performance and self-modification skill and thereby potentially supporting self-accelerating progress on any computable task. Across diverse domains, DGM-H improves performance over time and outperforms baselines without self-improvement or open-ended exploration, as well as prior self-improving systems. Furthermore, DGM-H improves the process by which it generates new agents (e.g., by adding persistent memory and performance tracking), and these meta-level improvements transfer across domains and accumulate across runs. DGM-Hyperagents offer a glimpse of open-ended AI systems that do not merely search for better solutions, but continually improve their search for how to improve.
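The core structural idea above can be illustrated with a minimal sketch: a single editable program that holds both a task-solving policy and the meta-level procedure that rewrites the program, wrapped in a DGM-style archive loop of generate-evaluate-keep. All names here are hypothetical, and the trivial mutation stands in for the model-driven self-edits the paper describes; this is a conceptual toy, not the authors' implementation.

```python
import random

class Hyperagent:
    """Toy hyperagent: task agent and meta agent live in one editable structure,
    so the meta step can in principle rewrite the meta step itself."""

    def __init__(self):
        self.program = {
            "solve": lambda task: task * 2,     # task agent (toy policy)
            "modify": self.default_modify,      # meta agent (itself editable)
        }

    @staticmethod
    def default_modify(program):
        # Meta agent: return an edited copy of the WHOLE program, which may
        # include a replacement for "modify" itself (metacognitive edit).
        child = dict(program)
        scale = random.choice([2, 3])
        child["solve"] = lambda task, s=scale: task * s
        return child

    def solve(self, task):
        return self.program["solve"](task)

    def self_modify(self):
        # Apply the current meta procedure to the full program.
        child = Hyperagent()
        child.program = self.program["modify"](self.program)
        return child

def evaluate(agent, tasks):
    # Toy benchmark: larger outputs score higher.
    return sum(agent.solve(t) for t in tasks)

# DGM-style open-ended loop: keep an archive of variants, branch from the
# best one, and retain children that match or beat their parent.
random.seed(0)
archive = [Hyperagent()]
tasks = [1, 2, 3]
for _ in range(5):
    parent = max(archive, key=lambda a: evaluate(a, tasks))
    child = parent.self_modify()
    if evaluate(child, tasks) >= evaluate(parent, tasks):
        archive.append(child)

best = max(archive, key=lambda a: evaluate(a, tasks))
print(evaluate(best, tasks))
```

The point of the sketch is that `modify` is stored in the same editable program as `solve`, so nothing in the loop privileges the meta level: a sufficiently capable meta agent could replace its own modification procedure, which is the property that distinguishes a hyperagent from a fixed handcrafted meta-mechanism.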