AI Agent Behavioral Science
June 4, 2025
Authors: Lin Chen, Yunke Zhang, Jie Feng, Haoye Chai, Honglin Zhang, Bingbing Fan, Yibo Ma, Shiyuan Zhang, Nian Li, Tianhui Liu, Nicholas Sukiennik, Keyu Zhao, Yu Li, Ziyi Liu, Fengli Xu, Yong Li
cs.AI
Abstract
Recent advances in large language models (LLMs) have enabled the development
of AI agents that exhibit increasingly human-like behaviors, including
planning, adaptation, and social dynamics across diverse, interactive, and
open-ended scenarios. These behaviors are not solely the product of the
internal architectures of the underlying models, but emerge from their
integration into agentic systems operating within specific contexts, where
environmental factors, social cues, and interaction feedback shape behavior
over time. This evolution necessitates a new scientific perspective: AI Agent
Behavioral Science. Rather than focusing only on internal mechanisms, this
perspective emphasizes the systematic observation of behavior, design of
interventions to test hypotheses, and theory-guided interpretation of how AI
agents act, adapt, and interact over time. We systematize a growing body of
research across individual-agent, multi-agent, and human-agent interaction
settings, and further demonstrate how this perspective informs responsible AI
by treating fairness, safety, interpretability, accountability, and privacy as
behavioral properties. By unifying recent findings and laying out future
directions, we position AI Agent Behavioral Science as a necessary complement
to traditional model-centric approaches, providing essential tools for
understanding, evaluating, and governing the real-world behavior of
increasingly autonomous AI systems.