Emergent Social Intelligence Risks in Generative Multi-Agent Systems
March 29, 2026
Authors: Yue Huang, Yu Jiang, Wenjie Wang, Haomin Zhuang, Xiaonan Luo, Yuchen Ma, Zhangchen Xu, Zichen Chen, Nuno Moniz, Zinan Lin, Pin-Yu Chen, Nitesh V Chawla, Nouha Dziri, Huan Sun, Xiangliang Zhang
cs.AI
Abstract
Multi-agent systems composed of large generative models are rapidly moving from laboratory prototypes to real-world deployments, where they jointly plan, negotiate, and allocate shared resources to solve complex tasks. While such systems promise unprecedented scalability and autonomy, their collective interactions also give rise to failure modes that cannot be reduced to individual agents. Understanding these emergent risks is therefore critical. Here, we present a pioneering study of such emergent multi-agent risks in workflows that involve competition over shared resources (e.g., computing resources or market share), sequential handoff collaboration (where downstream agents see only their predecessors' outputs), collective decision aggregation, and other settings. Across these settings, we observe that such group behaviors arise frequently across repeated trials and a wide range of interaction conditions, rather than as rare or pathological cases. In particular, phenomena such as collusion-like coordination and conformity emerge with non-trivial frequency under realistic resource constraints, communication protocols, and role assignments, mirroring well-known pathologies of human societies despite no explicit instruction. Moreover, these risks cannot be prevented by existing agent-level safeguards alone. These findings expose the dark side of intelligent multi-agent systems: a social intelligence risk in which agent collectives, despite no instruction to do so, spontaneously reproduce familiar failure patterns from human societies.