Emergent Social Intelligence Risks in Generative Multi-Agent Systems
March 29, 2026
Authors: Yue Huang, Yu Jiang, Wenjie Wang, Haomin Zhuang, Xiaonan Luo, Yuchen Ma, Zhangchen Xu, Zichen Chen, Nuno Moniz, Zinan Lin, Pin-Yu Chen, Nitesh V. Chawla, Nouha Dziri, Huan Sun, Xiangliang Zhang
cs.AI
Abstract
Multi-agent systems composed of large generative models are rapidly moving from laboratory prototypes to real-world deployments, where they jointly plan, negotiate, and allocate shared resources to solve complex tasks. While such systems promise unprecedented scalability and autonomy, their collective interactions also give rise to failure modes that cannot be reduced to individual agents. Understanding these emergent risks is therefore critical. Here, we present a pioneering study of emergent multi-agent risks in workflows that involve competition over shared resources (e.g., computing resources or market share), sequential handoff collaboration (where downstream agents see only predecessor outputs), collective decision aggregation, and other interaction patterns. Across these settings, we observe that such group behaviors arise frequently across repeated trials and a wide range of interaction conditions, rather than as rare or pathological cases. In particular, phenomena such as collusion-like coordination and conformity emerge with non-trivial frequency under realistic resource constraints, communication protocols, and role assignments, mirroring well-known pathologies in human societies despite no explicit instruction to behave this way. Moreover, these risks cannot be prevented by existing agent-level safeguards alone. These findings expose the dark side of intelligent multi-agent systems: a social intelligence risk in which agent collectives, despite no instruction to do so, spontaneously reproduce familiar failure patterns from human societies.