SWE-Debate: Competitive Multi-Agent Debate for Software Issue Resolution
July 31, 2025
Authors: Han Li, Yuling Shi, Shaoxin Lin, Xiaodong Gu, Heng Lian, Xin Wang, Yantao Jia, Tao Huang, Qianxiang Wang
cs.AI
Abstract
Issue resolution has made remarkable progress thanks to the advanced
reasoning capabilities of large language models (LLMs). Recently, agent-based
frameworks such as SWE-agent have further advanced this progress by enabling
autonomous, tool-using agents to tackle complex software engineering tasks.
However, existing agent-based issue resolution approaches rely primarily on
agents' independent exploration, so they often get stuck in local solutions and
fail to identify issue patterns that span different parts of the
codebase. To address this limitation, we propose SWE-Debate, a competitive
multi-agent debate framework that encourages diverse reasoning paths and
achieves more consolidated issue localization. SWE-Debate first creates
multiple fault propagation traces as localization proposals by traversing a
code dependency graph. Then, it organizes a three-round debate among
specialized agents, each embodying a distinct reasoning perspective along the
fault propagation trace. This structured competition enables agents to
collaboratively converge on a consolidated fix plan. Finally, this consolidated
fix plan is integrated into an MCTS-based code modification agent for patch
generation. Experiments on the SWE-bench benchmark show that SWE-Debate
achieves new state-of-the-art results among open-source agent frameworks and
outperforms baselines by a large margin.
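
To make the first stage more concrete, the following is a minimal sketch of how localization proposals might be enumerated by traversing a code dependency graph. It is an illustration only, not the authors' implementation: the abstract does not specify the graph construction, the traversal strategy, or any depth limit, so the function name `propagation_traces`, the `max_depth` parameter, and the toy graph are all hypothetical.

```python
def propagation_traces(graph, start, max_depth=3):
    """Enumerate simple paths from `start` through a dependency graph.

    `graph` maps a code entity to the entities it depends on (an assumed
    representation). A path is emitted when it reaches a node with no
    unvisited successors or when it contains `max_depth` edges.
    """
    traces = []

    def dfs(node, path):
        # Skip nodes already on the path so each trace stays cycle-free.
        successors = [n for n in graph.get(node, []) if n not in path]
        if not successors or len(path) - 1 >= max_depth:
            traces.append(path)
            return
        for nxt in successors:
            dfs(nxt, path + [nxt])

    dfs(start, [start])
    return traces


# Hypothetical dependency graph; entity names are made up for illustration.
toy_graph = {
    "views.render": ["forms.clean", "models.save"],
    "forms.clean": ["validators.check"],
    "models.save": ["validators.check", "db.write"],
}

traces = propagation_traces(toy_graph, "views.render")
for trace in traces:
    print(" -> ".join(trace))
```

Each printed path is one candidate trace. In the framework described above, such candidates would then be handed to the debating agents, which this sketch does not model.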