AgentsNet: Coordination and Collaborative Reasoning in Multi-Agent LLMs
July 11, 2025
Authors: Florian Grötschla, Luis Müller, Jan Tönshoff, Mikhail Galkin, Bryan Perozzi
cs.AI
Abstract
Large language models (LLMs) have demonstrated powerful problem-solving capabilities, in particular when organized into multi-agent systems. However, the advent of such systems also raises several questions about the ability of a complex network of agents to self-organize and collaborate effectively. While performance on standard reasoning benchmarks indicates how well multi-agent systems can solve reasoning tasks, it is unclear whether these systems can leverage their topology effectively. Here, we propose AgentsNet, a new benchmark for multi-agent reasoning. Drawing inspiration from classical problems in distributed systems and graph theory, AgentsNet measures the ability of multi-agent systems to collaboratively form problem-solving strategies, self-organize, and communicate effectively given a network topology. We evaluate a variety of baseline methods on AgentsNet, including homogeneous networks of agents that must first agree on basic protocols for organization and communication. We find that some frontier LLMs already demonstrate strong performance on small networks but begin to fall off as the network grows. While existing multi-agent benchmarks cover at most 2-5 agents, AgentsNet is practically unlimited in size and can scale with new generations of LLMs. As such, we also probe frontier models in setups with up to 100 agents.
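To make the kind of task concrete: one classical distributed-computing problem the abstract alludes to is graph coloring, where agents placed on a network must agree on labels such that no two neighbors share one. The sketch below illustrates the task itself on a small topology, not the benchmark's actual API or evaluation harness (which the abstract does not specify); the greedy solver stands in for what a multi-agent system would have to achieve through communication.

```python
# Hedged sketch: a centralized greedy solver for graph coloring, one of the
# classical distributed-systems/graph-theory tasks AgentsNet-style benchmarks
# draw on. In the benchmark setting, agents would have to reach such a
# coloring via message passing; here we only illustrate the task and its
# correctness check. Topology and function names are illustrative, not from
# the paper.

def greedy_coloring(adj):
    """Assign each node the smallest color not used by an already-colored neighbor."""
    colors = {}
    for node in sorted(adj):
        taken = {colors[n] for n in adj[node] if n in colors}
        color = 0
        while color in taken:
            color += 1
        colors[node] = color
    return colors

def is_proper(adj, colors):
    """A coloring is proper if no edge connects two nodes of the same color."""
    return all(colors[u] != colors[v] for u in adj for v in adj[u])

# A small 5-agent topology (a cycle with a chord), given as an adjacency dict.
topology = {0: [1, 4], 1: [0, 2], 2: [1, 3, 4], 3: [2, 4], 4: [0, 2, 3]}
coloring = greedy_coloring(topology)
assert is_proper(topology, coloring)
```

A multi-agent system succeeds on such a task only if agents exchange enough local information (e.g., their neighbors' tentative colors) to converge on a globally consistent assignment, which is exactly the self-organization ability the benchmark probes.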