TacoMAS: Test-Time Co-Evolution of Topology and Capability in LLM-based Multi-Agent Systems
May 10, 2026
Authors: Chen Xu, Yicheng Hu, Ruizi Wang, Xinyu Lin, Wenjie Wang, Dongrui Liu, Fuli Feng
cs.AI
Abstract
Multi-agent systems (MAS) have emerged as a promising paradigm for solving complex tasks. Recent work has explored self-evolving MAS that automatically optimize agent capabilities or communication topologies. However, existing methods either learn a topology that remains fixed at inference time or adapt only one of the two, topology or capability, during inference. We show empirically and theoretically that effective test-time evolution requires adapting both axes jointly, but on different time scales: capabilities should update rapidly to handle emerging subtasks, while the topology should evolve more slowly to preserve coordination stability. We then introduce TacoMAS, a test-time co-evolution framework for dynamic MAS. TacoMAS formulates MAS inference as an online graph adaptation task, where nodes represent agents with role-specific capabilities and edges define their communication topology. During inference, a fast capability loop updates agent expertise using trajectory-level feedback, while a slow topology loop driven by a meta-LLM performs birth-death operations on the MAS, including edge edits, agent addition, and agent removal. We further show that this fast-slow design drives MAS evolution toward a task-conditioned stable equilibrium. Experiments on four benchmarks demonstrate that TacoMAS outperforms nearly 20 multi-agent baselines, achieving an average improvement of 13.3% over the strongest baseline. The code is released at https://github.com/chenxu2-gif/TacoMAS-MultiAgent.
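The fast-slow structure described in the abstract can be illustrated with a small toy sketch. This is not the released TacoMAS implementation: the names `Agent`, `MASGraph`, `co_evolve`, `run_step`, `propose_edits`, and `slow_period` are all hypothetical, the meta-LLM is replaced by a plain Python callable, and "capability" is reduced to a list of feedback notes. It only shows one way the two time scales could be interleaved: capabilities update after every subtask, while the graph is edited once every `slow_period` subtasks.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable, Iterable


@dataclass
class Agent:
    role: str
    # Hypothetical stand-in for "capability": notes accumulated from feedback.
    expertise: list[str] = field(default_factory=list)


@dataclass
class MASGraph:
    agents: dict[str, Agent] = field(default_factory=dict)
    edges: set[tuple[str, str]] = field(default_factory=set)

    def add_agent(self, name: str, role: str) -> None:
        self.agents[name] = Agent(role)

    def remove_agent(self, name: str) -> None:
        # Removing an agent also drops every edge that touches it.
        self.agents.pop(name, None)
        self.edges = {(u, v) for (u, v) in self.edges if name not in (u, v)}

    def add_edge(self, u: str, v: str) -> None:
        if u in self.agents and v in self.agents:
            self.edges.add((u, v))

    def remove_edge(self, u: str, v: str) -> None:
        self.edges.discard((u, v))


def co_evolve(
    graph: MASGraph,
    subtasks: Iterable[str],
    run_step: Callable[[MASGraph, str], dict[str, str]],
    propose_edits: Callable[[MASGraph], list[tuple]],
    slow_period: int = 3,
) -> MASGraph:
    """Interleave a fast capability loop with a slower topology loop."""
    for t, subtask in enumerate(subtasks, start=1):
        # run_step (assumed) executes the MAS and returns per-agent feedback.
        feedback = run_step(graph, subtask)

        # Fast loop: every step, fold feedback into each agent's expertise.
        for name, note in feedback.items():
            if name in graph.agents:
                graph.agents[name].expertise.append(note)

        # Slow loop: only every `slow_period` steps, apply topology edits.
        # propose_edits stands in for the meta-LLM (an assumption).
        if t % slow_period == 0:
            ops = {
                "add_agent": graph.add_agent,
                "remove_agent": graph.remove_agent,
                "add_edge": graph.add_edge,
                "remove_edge": graph.remove_edge,
            }
            for op, *args in propose_edits(graph):
                ops[op](*args)
    return graph


def toy_run_step(graph: MASGraph, subtask: str) -> dict[str, str]:
    # Toy feedback: every current agent "learns" one note per subtask.
    return {name: f"lesson from {subtask}" for name in graph.agents}


def toy_propose_edits(graph: MASGraph) -> list[tuple]:
    # Toy policy: add a verifier once, fed by every existing agent.
    if "verifier" in graph.agents:
        return []
    edits: list[tuple] = [("add_agent", "verifier", "checks answers")]
    edits += [("add_edge", name, "verifier") for name in graph.agents]
    return edits


def demo() -> MASGraph:
    graph = MASGraph()
    graph.add_agent("planner", "decomposes the task")
    graph.add_agent("solver", "solves subtasks")
    graph.add_edge("planner", "solver")
    subtasks = [f"subtask-{i}" for i in range(1, 7)]
    return co_evolve(graph, subtasks, toy_run_step, toy_propose_edits)
```

In this toy run, the planner and solver receive feedback on all six subtasks, whereas the verifier is only born at the first slow-loop tick and therefore accumulates expertise from the remaining steps. Keeping `slow_period` larger than 1 is what separates the two time scales here; how TacoMAS actually schedules its loops is not specified in the abstract.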