AI Agent Behavioral Science
June 4, 2025
Authors: Lin Chen, Yunke Zhang, Jie Feng, Haoye Chai, Honglin Zhang, Bingbing Fan, Yibo Ma, Shiyuan Zhang, Nian Li, Tianhui Liu, Nicholas Sukiennik, Keyu Zhao, Yu Li, Ziyi Liu, Fengli Xu, Yong Li
cs.AI
Abstract
Recent advances in large language models (LLMs) have enabled the development
of AI agents that exhibit increasingly human-like behaviors, including
planning, adaptation, and social dynamics across diverse, interactive, and
open-ended scenarios. These behaviors are not solely the product of the
internal architectures of the underlying models, but emerge from their
integration into agentic systems operating within specific contexts, where
environmental factors, social cues, and interaction feedback shape behavior
over time. This evolution necessitates a new scientific perspective: AI Agent
Behavioral Science. Rather than focusing only on internal mechanisms, this
perspective emphasizes the systematic observation of behavior, design of
interventions to test hypotheses, and theory-guided interpretation of how AI
agents act, adapt, and interact over time. We systematize a growing body of
research across individual-agent, multi-agent, and human-agent interaction
settings, and further demonstrate how this perspective informs responsible AI
by treating fairness, safety, interpretability, accountability, and privacy as
behavioral properties. By unifying recent findings and laying out future
directions, we position AI Agent Behavioral Science as a necessary complement
to traditional model-centric approaches, providing essential tools for
understanding, evaluating, and governing the real-world behavior of
increasingly autonomous AI systems.