SWE-Master: Unleashing the Potential of Software Engineering Agents via Post-Training

Abstract

In this technical report, we present SWE-Master, an open-source and fully reproducible post-training framework for building effective software engineering (SWE) agents. SWE-Master systematically explores the complete agent development pipeline, including teacher-trajectory synthesis and data curation, long-horizon supervised fine-tuning (SFT), reinforcement learning (RL) with real execution feedback, and inference framework design. Starting from an open-source base model with limited initial SWE capability, SWE-Master demonstrates how a systematic optimization method can elicit strong long-horizon SWE task-solving abilities. We evaluate SWE-Master on SWE-bench Verified, a standard benchmark for realistic software engineering tasks. Under identical experimental settings, our approach achieves a resolve rate of 61.4% with Qwen2.5-Coder-32B, substantially outperforming existing open-source baselines. By further incorporating test-time scaling (TTS) with LLM-based environment feedback, SWE-Master reaches 70.8% at TTS@8, demonstrating strong performance potential. SWE-Master provides a practical and transparent foundation for advancing reproducible research on software engineering agents. The code is available at https://github.com/RUCAIBox/SWE-Master.
