Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents

August 13, 2024
Authors: Kexun Zhang, Weiran Yao, Zuxin Liu, Yihao Feng, Zhiwei Liu, Rithesh Murthy, Tian Lan, Lei Li, Renze Lou, Jiacheng Xu, Bo Pang, Yingbo Zhou, Shelby Heinecke, Silvio Savarese, Huan Wang, Caiming Xiong
cs.AI

Abstract

Large language model (LLM) agents have shown great potential in solving real-world software engineering (SWE) problems. The most advanced open-source SWE agent can resolve over 27% of real GitHub issues in SWE-Bench Lite. However, these sophisticated agent frameworks have different strengths: each excels at some tasks and underperforms on others. To fully harness this diversity, we propose DEI (Diversity Empowered Intelligence), a framework that leverages the agents' unique expertise. DEI functions as a meta-module on top of existing SWE agent frameworks, managing agent collectives for enhanced problem-solving. Experimental results show that a DEI-guided committee of agents surpasses the best individual agent's performance by a large margin. For instance, a group of open-source SWE agents with a maximum individual resolve rate of 27.3% on SWE-Bench Lite achieves a 34.3% resolve rate with DEI, a relative improvement of about 25% that beats most closed-source solutions. Our best-performing group reaches a 55% resolve rate, securing the highest ranking on SWE-Bench Lite. Our findings contribute to the growing body of research on collaborative AI systems and their potential to solve complex software engineering challenges.
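
The 25% figure in the abstract is a relative gain over the best individual agent, not a gain in percentage points. A minimal sketch of the arithmetic, using only the two resolve rates quoted above (the function and variable names are illustrative and do not come from the paper):

```python
def relative_improvement(baseline: float, improved: float) -> float:
    """Return the gain of `improved` over `baseline` as a fraction of `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (improved - baseline) / baseline


# Resolve rates on SWE-Bench Lite, in percent, as quoted in the abstract.
best_single = 27.3  # best individual open-source agent
with_dei = 34.3     # DEI-guided committee of the same agents

absolute_gain = with_dei - best_single                       # percentage points
relative_gain = relative_improvement(best_single, with_dei)  # fraction of baseline

print(f"absolute gain: {absolute_gain:.1f} percentage points")
print(f"relative gain: {relative_gain:.1%}")
```

The absolute gain is 7.0 percentage points, and the relative gain works out to roughly 25.6%, which the abstract rounds to 25%.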
