AWorld: Dynamic Multi-Agent System with Stable Maneuvering for Robust GAIA Problem Solving
August 13, 2025
Authors: Zhitian Xie, Qintong Wu, Chengyue Yu, Chenyi Zhuang, Jinjie Gu
cs.AI
Abstract
The rapid advancement of large language models (LLMs) has empowered
intelligent agents to leverage diverse external tools for solving complex
real-world problems. However, as agents increasingly depend on multiple tools,
they encounter new challenges: extended contexts from disparate sources and
noisy or irrelevant tool outputs can undermine system reliability and accuracy.
These challenges underscore the necessity for enhanced stability in agent-based
systems. To address this, we introduce dynamic supervision and maneuvering
mechanisms, constructing a robust and dynamic Multi-Agent System (MAS)
architecture within the AWorld framework. In our approach, the Execution Agent
invokes the Guard Agent at critical steps to verify and correct the reasoning
process, effectively reducing errors arising from noise and bolstering
problem-solving robustness. Extensive experiments on the GAIA test dataset
reveal that our dynamic maneuvering mechanism significantly improves both the
effectiveness and stability of solutions, outperforming both single-agent systems
(SAS) and standard tool-augmented systems. As a result, our dynamic MAS
achieved first place among open-source projects on the prestigious GAIA
leaderboard. These findings highlight the practical value of collaborative
agent roles in developing more reliable and trustworthy intelligent systems.