Emergent Social Intelligence Risks in Generative Multi-Agent Systems

Abstract

Multi-agent systems composed of large generative models are rapidly moving from laboratory prototypes to real-world deployments, where they jointly plan, negotiate, and allocate shared resources to solve complex tasks. While such systems promise unprecedented scalability and autonomy, their collective interaction also gives rise to failure modes that cannot be reduced to the behavior of individual agents. Understanding these emergent risks is therefore critical. Here, we present a pioneering study of such emergent multi-agent risks in workflows that involve competition over shared resources (e.g., computing resources or market share), sequential handoff collaboration (where downstream agents see only their predecessors' outputs), and collective decision aggregation, among others. Across these settings, we observe that such group behaviors arise frequently over repeated trials and a wide range of interaction conditions, rather than as rare or pathological cases. In particular, phenomena such as collusion-like coordination and conformity emerge with non-trivial frequency under realistic resource constraints, communication protocols, and role assignments, mirroring well-known pathologies in human societies even though the agents receive no explicit instruction to behave this way. Moreover, these risks cannot be prevented by existing agent-level safeguards alone. These findings expose the dark side of intelligent multi-agent systems: a social intelligence risk in which agent collectives spontaneously reproduce familiar failure patterns from human societies without being instructed to do so.