SWE-Master: Unleashing the Potential of Software Engineering Agents via Post-Training

Abstract

In this technical report, we present SWE-Master, an open-source and fully reproducible post-training framework for building effective software engineering agents. SWE-Master systematically explores the complete agent development pipeline, including teacher-trajectory synthesis and data curation, long-horizon supervised fine-tuning (SFT), reinforcement learning (RL) with real execution feedback, and inference framework design. Starting from an open-source base model with limited initial software engineering (SWE) capability, SWE-Master demonstrates how systematic optimization can elicit strong long-horizon SWE task-solving abilities. We evaluate SWE-Master on SWE-bench Verified, a standard benchmark for realistic software engineering tasks. Under identical experimental settings, our approach achieves a resolve rate of 61.4% with Qwen2.5-Coder-32B, substantially outperforming existing open-source baselines. By further incorporating test-time scaling (TTS) with LLM-based environment feedback, SWE-Master reaches 70.8% at TTS@8, demonstrating strong performance potential. SWE-Master provides a practical and transparent foundation for advancing reproducible research on software engineering agents. The code is available at https://github.com/RUCAIBox/SWE-Master.
