

DiffTester: Accelerating Unit Test Generation for Diffusion LLMs via Repetitive Pattern

September 29, 2025
Authors: Lekang Yang, Yuetong Liu, Yitong Zhang, Jia Li
cs.AI

Abstract

Software development relies heavily on extensive unit testing, which makes the efficiency of automated Unit Test Generation (UTG) particularly important. However, most existing LLMs generate test cases one token at a time in each forward pass, which leads to inefficient UTG. Recently, diffusion LLMs (dLLMs) have emerged, offering promising parallel generation capabilities and showing strong potential for efficient UTG. Despite this advantage, their application to UTG is still constrained by a clear trade-off between efficiency and test quality, since increasing the number of tokens generated in each step often causes a sharp decline in the quality of test cases. To overcome this limitation, we present DiffTester, an acceleration framework specifically tailored for dLLMs in UTG. The key idea of DiffTester is that unit tests targeting the same focal method often share repetitive structural patterns. By dynamically identifying these common patterns through abstract syntax tree analysis during generation, DiffTester adaptively increases the number of tokens produced at each step without compromising the quality of the output. To enable comprehensive evaluation, we extend the original TestEval benchmark, which was limited to Python, by introducing additional programming languages including Java and C++. Extensive experiments on three benchmarks with two representative models show that DiffTester delivers significant acceleration while preserving test coverage. Moreover, DiffTester generalizes well across different dLLMs and programming languages, providing a practical and scalable solution for efficient UTG in software development. Code and data are publicly available at https://github.com/wellbeingyang/DLM4UTG-open .
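The abstract describes the mechanism only at a high level. Below is a minimal, hypothetical Python sketch of the intuition: unit tests written for the same focal method often share a structural skeleton, which can be detected with abstract syntax tree analysis and then used to decide how many tokens a diffusion decoder may commit per step. The helper names, the `test_` prefix heuristic, and the token-boost values are illustrative assumptions, not the authors' implementation (see the repository linked above for the actual code).

```python
# Hypothetical sketch of DiffTester's core intuition, not the authors' implementation:
# detect a repeated structural pattern across already-generated unit tests via AST
# analysis, and use it to adaptively raise the number of tokens decoded per step.
from __future__ import annotations

import ast


def statement_shape(func: ast.FunctionDef) -> tuple:
    """Abstract a test function into the sequence of its top-level statement types."""
    return tuple(type(stmt).__name__ for stmt in func.body)


def shared_pattern(tests_source: str) -> tuple | None:
    """Return the statement shape shared by all test functions, if they all agree."""
    tree = ast.parse(tests_source)
    shapes = [
        statement_shape(node)
        for node in ast.walk(tree)
        if isinstance(node, ast.FunctionDef) and node.name.startswith("test_")
    ]
    if shapes and all(shape == shapes[0] for shape in shapes):
        return shapes[0]
    return None


def tokens_per_step(pattern: tuple | None, base: int = 1, boost: int = 8) -> int:
    """Hypothetical policy: once a repetitive pattern is confirmed, decode more
    tokens per step; otherwise fall back to conservative one-token decoding."""
    return boost if pattern is not None else base


# Two tests for the same focal method `add` sharing the same (Assign, Assert) skeleton.
generated_so_far = """
def test_add_positive():
    result = add(2, 3)
    assert result == 5

def test_add_negative():
    result = add(-2, -3)
    assert result == -5
"""

pattern = shared_pattern(generated_so_far)
print(pattern)                   # ('Assign', 'Assert')
print(tokens_per_step(pattern))  # 8 -> commit more tokens when generating the next test
```

The design choice this illustrates is that the acceleration signal comes from the tests already produced for the focal method, so the decoder only speeds up where repetition makes the continuation predictable, which is why coverage can be preserved.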