A Critique of Agent Models

Abstract

What is an agent? What constitutes agency? Large Language Model (LLM) systems are increasingly marketed as "coding agents", "AI co-scientists", and other "agentic" tools that promise to drive up productivity. At the same time, "existential" concerns are growing, such as AI escaping human control and turning destructive power against humans under a speculative "machine agency". It has therefore become essential to clarify where automation ends and agency begins, both for building capable systems and for understanding whether there is anything to fear and, if so, what. Drawing on Descartes' grounding of agency in independent thought, and on portrayals of autonomous beings in science fiction, we survey the current landscape of AI agents and analyze agent architectures along five dimensions: goal, identity, decision-making, self-regulation, and learning. Specifically, we argue that genuine agency requires these structures to be internalized within the system itself rather than assembled through external scaffolding. This distinction between agentic systems, whose competence resides in engineered workflows, and agentive systems, whose capabilities (including social interaction) arise endogenously, defines the boundary between systems designed for prescribed tasks and those capable of operating in the open world with true autonomy. Building on this analysis, we propose the Goal-Identity-Configurator (GIC) architecture for a general-purpose agent model, combining hierarchical goal decomposition, identity evolution, simulative reasoning grounded in a separately trained world model, learned self-regulation, and self-directed learning from both real and simulated experience. Furthermore, we share insights on the auditability, controllability, and safety of agentive systems that possess greater autonomy and "agency" but remain under human oversight.