A Critique of Agent Models

Abstract

What is an agent? What constitutes agency? Large Language Model (LLM) systems are increasingly marketed as "coding agents", "AI co-scientists", and other "agentic" tools that promise to drive up productivity. At the same time, "existential" concerns have been raised that AI could escape human control and wield destructive power against humans under a speculative "machine agency". It has therefore become essential to clarify where automation ends and agency begins, both for building capable systems and for understanding whether there is anything to fear and, if so, what. Drawing on Descartes' grounding of agency in independent thought and on portrayals of autonomous beings in science fiction, we survey the current landscape of AI agents and analyze agent architectures along five dimensions: goal, identity, decision-making, self-regulation, and learning. Specifically, we argue that genuine agency requires these structures to be internalized within the system itself rather than assembled through external scaffolding. This distinction between agentic systems, whose competence resides in engineered workflows, and agentive systems, whose capabilities (including social interaction) arise endogenously, defines the boundary between systems designed for prescribed tasks and those capable of operating in the open world with true autonomy. Building on this analysis, we propose the Goal-Identity-Configurator (GIC) architecture for a general-purpose agent model, which combines hierarchical goal decomposition, identity evolution, simulative reasoning grounded in a separately trained world model, learned self-regulation, and self-directed learning from both real and simulated experience. Finally, we offer insights into the auditability, controllability, and safety of agentive systems that possess greater autonomy and "agency" but remain under human oversight.