SWE-Exp: Experience-Driven Software Issue Resolution

July 31, 2025
Authors: Silin Chen, Shaoxin Lin, Xiaodong Gu, Yuling Shi, Heng Lian, Longfei Yun, Dong Chen, Weiguo Sun, Lin Cao, Qianxiang Wang
cs.AI

Abstract

Recent large language model (LLM) agents have made remarkable progress in software issue resolution, leveraging advanced techniques such as multi-agent collaboration and Monte Carlo Tree Search (MCTS). However, current agents act as memoryless explorers: they treat each problem separately, without retaining or reusing knowledge from previous repair experiences. This leads to redundant exploration of failed trajectories and missed chances to adapt successful issue resolution methods to similar problems. To address this problem, we introduce SWE-Exp, an experience-enhanced approach that distills concise and actionable experience from prior agent trajectories, enabling continuous learning across issues. Our method introduces a multi-faceted experience bank that captures both successful and failed repair attempts. Specifically, it extracts reusable issue resolution knowledge at different levels, from high-level problem comprehension to specific code changes. Experiments show that SWE-Exp achieves a state-of-the-art resolution rate (41.6% Pass@1) on SWE-bench-Verified under open-source agent frameworks. Our approach establishes a new paradigm in which automated software engineering agents systematically accumulate and leverage repair expertise, fundamentally shifting from trial-and-error exploration to strategic, experience-driven issue resolution.
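
To make the idea of an experience bank concrete, here is a minimal illustrative sketch in Python. It is not the SWE-Exp implementation: the abstract does not specify data structures or a retrieval method, so the names `Experience` and `ExperienceBank`, the field names, and the word-overlap similarity are all hypothetical. It only shows one way to store successful and failed attempts at two levels (issue comprehension and code change) and to look them up for a new issue.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Experience:
    """One distilled experience (hypothetical schema, not from the paper)."""
    issue_summary: str   # high-level comprehension of the problem
    change_summary: str  # the specific code change that was attempted
    succeeded: bool      # True for a successful repair, False for a failed one


@dataclass
class ExperienceBank:
    """Stores experiences from both successful and failed attempts."""
    entries: List[Experience] = field(default_factory=list)

    def add(self, exp: Experience) -> None:
        self.entries.append(exp)

    def retrieve(self, issue_text: str, k: int = 2) -> List[Experience]:
        # Assumed similarity measure: number of shared lowercase words.
        # A real system would likely use something stronger.
        query = set(issue_text.lower().split())

        def overlap(exp: Experience) -> int:
            return len(query & set(exp.issue_summary.lower().split()))

        ranked = sorted(self.entries, key=overlap, reverse=True)
        # Keep at most k entries that share at least one word with the query.
        return [exp for exp in ranked[:k] if overlap(exp) > 0]
```

An agent built this way could call `retrieve` on a new issue before exploring, then use the returned entries to favor changes that previously worked and avoid ones recorded as failures.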