
SWE-Flow: Synthesizing Software Engineering Data in a Test-Driven Manner

June 10, 2025
Authors: Lei Zhang, Jiaxi Yang, Min Yang, Jian Yang, Mouxiang Chen, Jiajun Zhang, Zeyu Cui, Binyuan Hui, Junyang Lin
cs.AI

Abstract

We introduce **SWE-Flow**, a novel data synthesis framework grounded in Test-Driven Development (TDD). Unlike existing software engineering data that rely on human-submitted issues, **SWE-Flow** automatically infers incremental development steps directly from unit tests, which inherently encapsulate high-level requirements. The core of **SWE-Flow** is the construction of a Runtime Dependency Graph (RDG), which precisely captures function interactions, enabling the generation of a structured, step-by-step *development schedule*. At each step, **SWE-Flow** produces a partial codebase, the corresponding unit tests, and the necessary code modifications, resulting in fully verifiable TDD tasks. With this approach, we generated 16,061 training instances and 2,020 test instances from real-world GitHub projects, creating the **SWE-Flow-Eval** benchmark. Our experiments show that fine-tuning open models on this dataset significantly improves performance in TDD-based coding. To facilitate further research, we release all code, datasets, models, and Docker images on [GitHub](https://github.com/Hambaobao/SWE-Flow).
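
To make the idea of deriving a development schedule from a runtime dependency graph more concrete, here is a minimal Python sketch. It is not the SWE-Flow implementation: the `CallTracer` class, its `track` decorator, and the toy functions `parse`, `total`, and `test_total` are hypothetical names introduced only for illustration. The sketch assumes that recording caller-to-callee edges while a unit test runs, and then ordering functions so that callees come before callers, is a reasonable approximation of "infer incremental steps from tests".

```python
from collections import defaultdict
from functools import wraps
from graphlib import TopologicalSorter


class CallTracer:
    """Hypothetical helper: records caller -> callee edges at runtime."""

    def __init__(self):
        self.deps = defaultdict(set)  # function name -> names it called
        self._stack = []              # names of tracked calls in progress

    def track(self, fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            name = fn.__name__
            self.deps[name]  # register the node even if it calls nothing
            # Record an edge from the current caller; skip direct recursion,
            # which would otherwise create a self-loop in the graph.
            if self._stack and self._stack[-1] != name:
                self.deps[self._stack[-1]].add(name)
            self._stack.append(name)
            try:
                return fn(*args, **kwargs)
            finally:
                self._stack.pop()
        return wrapper

    def schedule(self):
        # TopologicalSorter expects node -> predecessors. Treating callees as
        # predecessors yields an order in which dependencies come first.
        return list(TopologicalSorter(self.deps).static_order())


tracer = CallTracer()


@tracer.track
def parse(text):
    return [int(token) for token in text.split()]


@tracer.track
def total(text):
    return sum(parse(text))


@tracer.track
def test_total():
    assert total("1 2 3") == 6


test_total()              # running the test populates the graph
plan = tracer.schedule()  # dependency-first order of the traced functions
print(plan)
```

In this toy setting each entry of `plan` could be read as one incremental step: implement the function, then run the tests that reach it. How SWE-Flow itself builds the RDG and splits the codebase at each step is described in the paper and repository, not here.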