Agent KB: Leveraging Cross-Domain Experience for Agentic Problem Solving
July 8, 2025
Authors: Xiangru Tang, Tianrui Qin, Tianhao Peng, Ziyang Zhou, Daniel Shao, Tingting Du, Xinming Wei, Peng Xia, Fang Wu, He Zhu, Ge Zhang, Jiaheng Liu, Xingyao Wang, Sirui Hong, Chenglin Wu, Hao Cheng, Chi Wang, Wangchunshu Zhou
cs.AI
Abstract
As language agents tackle increasingly complex tasks, they struggle with
effective error correction and experience reuse across domains. We introduce
Agent KB, a hierarchical experience framework that enables complex agentic
problem solving via a novel Reason-Retrieve-Refine pipeline. Agent KB addresses
a core limitation: agents traditionally cannot learn from each other's
experiences. By capturing both high-level strategies and detailed execution
logs, Agent KB creates a shared knowledge base that enables cross-agent
knowledge transfer. Evaluated on the GAIA benchmark, Agent KB improves success
rates by up to 16.28 percentage points. On the most challenging tasks, Claude-3
improves from 38.46% to 57.69%, while GPT-4 improves from 53.49% to 73.26% on
intermediate tasks. On SWE-bench code repair, Agent KB enables Claude-3 to
improve from 41.33% to 53.33%. Our results suggest that Agent KB provides a
modular, framework-agnostic infrastructure for enabling agents to learn from
past experiences and generalize successful strategies to new tasks.
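
The reported improvements can be restated as percentage-point gains with a few lines of arithmetic. The sketch below uses only the before/after figures quoted in the abstract; the dictionary labels and the `point_gain` helper are illustrative names, not anything defined by the paper. The abstract does not say which model or task split the 16.28-point figure refers to, so that number is not recomputed here.

```python
from decimal import Decimal

# (baseline %, with Agent KB %) exactly as quoted in the abstract.
# The labels are our own shorthand, not the paper's terminology.
reported = {
    "GAIA hardest tasks, Claude-3": (Decimal("38.46"), Decimal("57.69")),
    "GAIA intermediate tasks, GPT-4": (Decimal("53.49"), Decimal("73.26")),
    "SWE-bench code repair, Claude-3": (Decimal("41.33"), Decimal("53.33")),
}


def point_gain(before: Decimal, after: Decimal) -> Decimal:
    """Absolute gain in percentage points (not a relative change)."""
    return after - before


# Decimal keeps the two-decimal figures exact, avoiding float rounding noise.
gains = {label: point_gain(b, a) for label, (b, a) in reported.items()}

for label, gain in gains.items():
    print(f"{label}: +{gain} percentage points")
```

Using `Decimal` rather than `float` is a design choice for this sketch: the inputs are two-decimal percentages, and exact decimal subtraction keeps the differences free of binary rounding artifacts.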