Hyperagents

Abstract

Self-improving AI systems aim to reduce reliance on human engineering by learning to improve their own learning and problem-solving processes. Existing approaches to self-improvement rely on fixed, handcrafted meta-level mechanisms, fundamentally limiting how fast such systems can improve. The Darwin Gödel Machine (DGM) demonstrates open-ended self-improvement in coding by repeatedly generating and evaluating self-modified variants. Because both evaluation and self-modification are coding tasks, gains in coding ability can translate into gains in self-improvement ability. However, this alignment does not generally hold beyond coding domains. We introduce hyperagents, self-referential agents that integrate a task agent (which solves the target task) and a meta agent (which modifies itself and the task agent) into a single editable program. Crucially, the meta-level modification procedure is itself editable. This enables metacognitive self-modification, which improves not only the task-solving behavior but also the mechanism that generates future improvements. We instantiate this framework by extending the DGM to create DGM-Hyperagents (DGM-H). The DGM-H eliminates the assumption of domain-specific alignment between task performance and self-modification skill, and could therefore potentially support self-accelerating progress on any computable task. Across diverse domains, the DGM-H improves performance over time and outperforms baselines without self-improvement or open-ended exploration, as well as prior self-improving systems. Furthermore, the DGM-H improves the process by which it generates new agents (e.g., persistent memory, performance tracking), and these meta-level improvements transfer across domains and accumulate across runs. DGM-Hyperagents offer a glimpse of open-ended AI systems that do not merely search for better solutions, but continually improve their search for how to improve.
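
To make the structure concrete, the following is a minimal toy sketch in Python, not the DGM-H implementation. It assumes a hypothetical task (return `x + TARGET`) and reduces the "single editable program" to two integers: `offset`, used by the task agent, and `step`, used by the meta agent. All names here (`Hyperagent`, `solve`, `modify`, `error`, `run`, `edit_meta`) are illustrative assumptions. Parent selection is simplified to "best variant not yet expanded", and no language model is involved. The sketch only shows how letting the meta agent edit its own modification rule can speed up improvement compared with a fixed rule.

```python
from __future__ import annotations

from dataclasses import dataclass

TARGET = 37  # hypothetical toy task: solve(x) should return x + TARGET


@dataclass(frozen=True)
class Hyperagent:
    """Toy stand-in for a single editable program (illustrative only)."""
    offset: int = 0  # task-level state, read by solve()
    step: int = 1    # meta-level state, read by modify()

    def solve(self, x: int) -> int:
        # Task agent: attempts the target task.
        return x + self.offset

    def modify(self, edit_meta: bool = True) -> list[Hyperagent]:
        # Meta agent: proposes self-modified variants. With edit_meta=True it
        # may also change its own step size, i.e. how future edits are made.
        steps = {self.step}
        if edit_meta:
            steps |= {self.step * 2, max(1, self.step // 2)}
        return [Hyperagent(self.offset + sign * s, s)
                for s in sorted(steps) for sign in (1, -1)]


def error(agent: Hyperagent) -> int:
    # Evaluation: worst-case absolute error on a few probe inputs.
    return max(abs(agent.solve(x) - (x + TARGET)) for x in (0, 5, 10))


def run(generations: int, edit_meta: bool) -> int:
    # The archive keeps every variant generated so far, with its error.
    archive = {Hyperagent(): error(Hyperagent())}
    expanded = set()
    for _ in range(generations):
        # Simplified parent selection (an assumption of this sketch):
        # pick the lowest-error variant that has not been expanded yet.
        frontier = [a for a in archive if a not in expanded]
        parent = min(frontier, key=archive.get)
        expanded.add(parent)
        for child in parent.modify(edit_meta):
            archive.setdefault(child, error(child))
    return min(archive.values())


fixed_error = run(12, edit_meta=False)    # meta-level rule frozen
editable_error = run(12, edit_meta=True)  # meta-level rule editable
print(fixed_error, editable_error)
```

With the same budget of 12 generations, the run with an editable meta level ends with a smaller error than the run with a fixed meta level, because variants that enlarge or shrink their own step size are kept in the archive and reused as parents. This is only an analogy for the claim in the abstract; in the paper the editable object is the agent's own program, not a pair of integers.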