A Critique of Agent Models

Abstract

What is an agent? What constitutes agency? Large Language Model (LLM) systems marketed as "coding agents", "AI co-scientists", and other "agentic" tools promise to drive up productivity. At the same time, "existential" concerns have surfaced, such as AI escaping human control with destructive power under a speculative "machine agency" directed against humans. It has therefore become essential to clarify where automation ends and agency begins, both for building capable systems and for understanding whether to fear them and what exactly to fear. Drawing on Descartes' grounding of agency in independent thought, and on portrayals of autonomous beings in science fiction, we survey the current landscape of AI agents and analyze agent architectures along five dimensions: goal, identity, decision-making, self-regulation, and learning. Specifically, we argue that genuine agency requires these structures to be internalized within the system itself rather than assembled through external scaffolding. This distinction between agentic systems, whose competence resides in engineered workflows, and agentive systems, whose capabilities (including social interaction) arise endogenously, defines the boundary between systems designed for prescribed tasks and those capable of operating in the open world with true autonomy. Building on this analysis, we propose the Goal-Identity-Configurator (GIC) architecture for a general-purpose agent model, which combines hierarchical goal decomposition, identity evolution, simulative reasoning grounded in a separately trained world model, learned self-regulation, and self-directed learning from both real and simulated experience. Furthermore, we offer insights on the auditability, controllability, and safety of agentive systems that possess greater autonomy and "agency" but remain under human oversight.