AgentsNet: Coordination and Collaborative Reasoning in Multi-Agent LLMs
July 11, 2025
Authors: Florian Grötschla, Luis Müller, Jan Tönshoff, Mikhail Galkin, Bryan Perozzi
cs.AI
Abstract
Large language models (LLMs) have demonstrated powerful problem-solving
capabilities, in particular when organized in multi-agent systems. However, the
advent of such systems also raises several questions on the ability of a
complex network of agents to effectively self-organize and collaborate. While
measuring performance on standard reasoning benchmarks indicates how well
multi-agent systems can solve reasoning tasks, it is unclear whether these
systems are able to leverage their topology effectively. Here, we propose
AgentsNet, a new benchmark for multi-agent reasoning. By drawing inspiration
from classical problems in distributed systems and graph theory, AgentsNet
measures the ability of multi-agent systems to collaboratively form strategies
for problem-solving, self-organization, and effective communication given a
network topology. We evaluate a variety of baseline methods on AgentsNet
including homogeneous networks of agents which first have to agree on basic
protocols for organization and communication. We find that some frontier LLMs
already demonstrate strong performance on small networks but begin to
fall off as the network grows. While existing multi-agent
benchmarks cover at most 2-5 agents, AgentsNet is practically unlimited in size
and can scale with new generations of LLMs. As such, we also probe frontier
models in a setup with up to 100 agents.
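To make the kind of task concrete, here is an illustrative sketch (not the AgentsNet implementation) of one classic distributed-computing problem the benchmark draws on: graph coloring via synchronous, neighbor-only message passing. Each agent sees only its neighbors' state; in this toy version the agents follow a fixed local rule, whereas in AgentsNet an LLM would sit at each node and the agents would first have to agree on such a protocol themselves.

```python
def distributed_coloring(adjacency):
    """Toy synchronous protocol: in each round, every undecided agent that
    is a local maximum (by ID) among its undecided neighbors picks the
    smallest color not used by its already-decided neighbors.

    `adjacency` maps node ID -> list of neighbor IDs (undirected graph).
    Returns a proper coloring as a dict: node ID -> color index.
    """
    colors = {}
    undecided = set(adjacency)
    while undecided:
        # All agents act simultaneously, based on the previous round's state.
        decided_now = {}
        for node in undecided:
            higher = [n for n in adjacency[node] if n in undecided and n > node]
            if not higher:  # local maximum among still-undecided neighbors
                used = {colors[n] for n in adjacency[node] if n in colors}
                color = 0
                while color in used:
                    color += 1
                decided_now[node] = color
        colors.update(decided_now)
        undecided -= decided_now.keys()
    return colors

# A 4-cycle: adjacent agents must end up with different colors.
graph = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
coloring = distributed_coloring(graph)
assert all(coloring[u] != coloring[v] for u in graph for v in graph[u])
```

The point of the sketch is the locality constraint: no agent ever inspects the global graph, which mirrors how AgentsNet restricts communication to the given network topology.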