AWorld: Dynamic Multi-Agent System with Stable Maneuvering for Robust GAIA Problem Solving
August 13, 2025
Authors: Zhitian Xie, Qintong Wu, Chengyue Yu, Chenyi Zhuang, Jinjie Gu
cs.AI
Abstract
The rapid advancement of large language models (LLMs) has empowered
intelligent agents to leverage diverse external tools for solving complex
real-world problems. However, as agents increasingly depend on multiple tools,
they encounter new challenges: extended contexts from disparate sources and
noisy or irrelevant tool outputs can undermine system reliability and accuracy.
These challenges underscore the necessity for enhanced stability in agent-based
systems. To address this, we introduce dynamic supervision and maneuvering
mechanisms, constructing a robust and dynamic Multi-Agent System (MAS)
architecture within the AWorld framework. In our approach, the Execution Agent
invokes the Guard Agent at critical steps to verify and correct the reasoning
process, effectively reducing errors arising from noise and bolstering
problem-solving robustness. Extensive experiments on the GAIA test dataset
reveal that our dynamic maneuvering mechanism significantly improves both the
effectiveness and stability of solutions, outperforming single-agent systems
(SAS) and standard tool-augmented systems. As a result, our dynamic MAS
achieved first place among open-source projects on the prestigious GAIA
leaderboard. These findings highlight the practical value of collaborative
agent roles in developing more reliable and trustworthy intelligent systems.
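
The supervision pattern described above, in which an Execution Agent calls a Guard Agent at critical steps, might be sketched roughly as follows. This is an illustrative toy under stated assumptions, not the AWorld implementation: the names Step, run_with_guard, and toy_guard are hypothetical, and the guard's rule (a claim must appear in the tool output) is a stand-in for whatever verification the real Guard Agent performs.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Step:
    """One reasoning step of a hypothetical execution agent."""
    claim: str            # what the agent concluded at this step
    tool_output: str      # raw (possibly noisy) output from a tool
    critical: bool = False  # whether the guard should be consulted


# A guard takes the history and the current step, and returns a corrected
# claim, or None when it finds nothing to fix. (Assumed interface.)
GuardFn = Callable[[List[Step], Step], Optional[str]]


def run_with_guard(steps: List[Step], guard: GuardFn) -> List[Step]:
    """Replay steps, letting the guard revise claims at critical steps only."""
    history: List[Step] = []
    for step in steps:
        if step.critical:
            correction = guard(history, step)
            if correction is not None:
                # Replace the claim but keep the original tool output.
                step = Step(correction, step.tool_output, step.critical)
        history.append(step)
    return history


def toy_guard(history: List[Step], step: Step) -> Optional[str]:
    """Toy rule: a claim is accepted only if it appears in the tool output."""
    if step.claim in step.tool_output:
        return None
    return "UNVERIFIED: " + step.claim


if __name__ == "__main__":
    trace = [
        Step("Paris", "The capital of France is Paris.", critical=True),
        Step("1990", "unrelated page footer text", critical=True),
        Step("draft note", "no tool used", critical=False),
    ]
    for s in run_with_guard(trace, toy_guard):
        print(s.claim)
```

In this sketch the guard is consulted only on steps marked critical, so non-critical steps pass through unchanged; how critical steps are chosen in the actual system is not specified in the abstract.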