Fixing 7,400 Bugs for 1$: Cheap Crash-Site Program Repair

Abstract

The rapid advancement of bug-finding techniques has led to the discovery of more vulnerabilities than developers can reasonably fix, creating an urgent need for effective Automated Program Repair (APR) methods. However, the complexity of modern bugs often makes precise root cause analysis difficult and unreliable. To address this challenge, we propose crash-site repair to simplify the repair task while still mitigating the risk of exploitation. In addition, we introduce a template-guided patch generation approach that significantly reduces the token cost of Large Language Models (LLMs) while maintaining both efficiency and effectiveness. We implement our prototype system, WILLIAMT, and evaluate it against state-of-the-art APR tools. Our results show that, when combined with the top-performing agent CodeRover-S, WILLIAMT reduces token cost by 45.9% and increases the bug-fixing rate to 73.5% (+29.6%) on ARVO, a ground-truth open source software vulnerabilities benchmark. Furthermore, we demonstrate that WILLIAMT can function effectively even without access to frontier LLMs: even a local model running on a Mac M4 Mini achieves a reasonable repair rate. These findings highlight the broad applicability and scalability of WILLIAMT.
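To make the idea of a template-guided crash-site patch more concrete, the following is a minimal sketch and not WILLIAMT's actual implementation. It assumes the crash line is already known (for example from a sanitizer report) and that the model only has to supply a short guard condition, which is then slotted into a fixed template placed immediately before the crashing statement. The template text and the names `GUARD_TEMPLATE`, `CrashSite`, and `apply_guard` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical guard template: only `condition` and `action` need to be
# produced by a model; the surrounding syntax is fixed in advance.
GUARD_TEMPLATE = "{indent}if ({condition}) {{ {action} }}"


@dataclass
class CrashSite:
    line: int  # 1-based line number of the crashing statement (assumed input)


def apply_guard(source: str, site: CrashSite, condition: str,
                action: str = "return -1;") -> str:
    """Insert a guard directly before the crashing line and return the patched source."""
    lines = source.splitlines()
    if not 1 <= site.line <= len(lines):
        raise ValueError("crash line is outside the file")
    crash_line = lines[site.line - 1]
    # Reuse the crashing line's indentation so the guard sits at the same level.
    indent = crash_line[: len(crash_line) - len(crash_line.lstrip())]
    guard = GUARD_TEMPLATE.format(indent=indent, condition=condition, action=action)
    lines.insert(site.line - 1, guard)
    return "\n".join(lines) + "\n"


# Illustrative C function with an out-of-bounds read on line 2.
buggy = "int get(int *buf, int len, int i) {\n    return buf[i];\n}\n"
patched = apply_guard(buggy, CrashSite(line=2), "i < 0 || i >= len")
print(patched)
```

In this sketch the only free-form text a model would need to emit is the condition string, which illustrates how a fixed template could keep output tokens low. Whether a guard at the crash site is an acceptable fix for a given bug still has to be validated, for instance by re-running the crashing input and the project's tests.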
