Hyperagents
March 19, 2026
Authors: Jenny Zhang, Bingchen Zhao, Wannan Yang, Jakob Foerster, Jeff Clune, Minqi Jiang, Sam Devlin, Tatiana Shavrina
cs.AI
Abstract
Self-improving AI systems aim to reduce reliance on human engineering by learning to improve their own learning and problem-solving processes. Existing approaches to self-improvement rely on fixed, handcrafted meta-level mechanisms, fundamentally limiting how fast such systems can improve. The Darwin Gödel Machine (DGM) demonstrates open-ended self-improvement in coding by repeatedly generating and evaluating self-modified variants. Because both evaluation and self-modification are coding tasks, gains in coding ability can translate into gains in self-improvement ability. However, this alignment does not generally hold beyond coding domains. We introduce hyperagents, self-referential agents that integrate a task agent (which solves the target task) and a meta agent (which modifies itself and the task agent) into a single editable program. Crucially, the meta-level modification procedure is itself editable, enabling metacognitive self-modification, improving not only the task-solving behavior, but also the mechanism that generates future improvements. We instantiate this framework by extending DGM to create DGM-Hyperagents (DGM-H), eliminating the assumption of domain-specific alignment between task performance and self-modification skill to potentially support self-accelerating progress on any computable task. Across diverse domains, the DGM-H improves performance over time and outperforms baselines without self-improvement or open-ended exploration, as well as prior self-improving systems. Furthermore, the DGM-H improves the process by which it generates new agents (e.g., persistent memory, performance tracking), and these meta-level improvements transfer across domains and accumulate across runs. DGM-Hyperagents offer a glimpse of open-ended AI systems that do not merely search for better solutions, but continually improve their search for how to improve.
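The core architectural idea in the abstract, a task agent and a meta agent fused into a single editable program, where the meta-level edit procedure can rewrite itself, can be illustrated with a toy sketch. This is a hypothetical, minimal model, not the DGM-H implementation; all names (`initial_task_step`, `initial_meta_step`, the `agent` dict) are invented for illustration.

```python
# Toy sketch of a self-referential "hyperagent": the task-solving code and
# the meta-level modification code live in one editable structure, and the
# meta procedure may rewrite either part -- including itself.

def initial_task_step(x):
    """Task agent: a deliberately weak solver the meta level can improve."""
    return x + 1

def initial_meta_step(agent):
    """Meta agent: returns an edited variant of the whole agent.

    Because it also returns a new 'meta' entry, later iterations can change
    how modifications themselves are generated (metacognitive self-edit).
    """
    def improved_task_step(x):
        return x * 2  # a (toy) self-modification of task behavior

    def improved_meta_step(inner_agent):
        # The rewritten meta level: here it additionally keeps a history of
        # edits, loosely analogous to the persistent memory / performance
        # tracking the abstract cites as emergent meta-level improvements.
        inner_agent.setdefault("history", []).append("meta edit")
        return inner_agent

    return {"task": improved_task_step, "meta": improved_meta_step}

# One outer-loop iteration: the agent edits itself via its own meta entry.
agent = {"task": initial_task_step, "meta": initial_meta_step}
agent = agent["meta"](agent)   # first self-modification
result = agent["task"](5)      # improved task behavior
agent = agent["meta"](agent)   # second edit uses the *new* meta code
```

The key property the sketch captures is that after the first self-modification, subsequent edits are produced by code that was itself produced by an earlier edit, so the mechanism that generates future improvements is among the things being improved.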