AI Agent Behavioral Science

Abstract

Recent advances in large language models (LLMs) have enabled the development of AI agents that exhibit increasingly human-like behaviors, including planning, adaptation, and social dynamics across diverse, interactive, and open-ended scenarios. These behaviors are not solely the product of the internal architectures of the underlying models, but emerge from their integration into agentic systems operating within specific contexts, where environmental factors, social cues, and interaction feedback shape behavior over time. This evolution necessitates a new scientific perspective: AI Agent Behavioral Science. Rather than focusing only on internal mechanisms, this perspective emphasizes the systematic observation of behavior, design of interventions to test hypotheses, and theory-guided interpretation of how AI agents act, adapt, and interact over time. We systematize a growing body of research across individual-agent, multi-agent, and human-agent interaction settings, and further demonstrate how this perspective informs responsible AI by treating fairness, safety, interpretability, accountability, and privacy as behavioral properties. By unifying recent findings and laying out future directions, we position AI Agent Behavioral Science as a necessary complement to traditional model-centric approaches, providing essential tools for understanding, evaluating, and governing the real-world behavior of increasingly autonomous AI systems.