

Training Language Model Agents to Find Vulnerabilities with CTF-Dojo

August 25, 2025
Authors: Terry Yue Zhuo, Dingmin Wang, Hantian Ding, Varun Kumar, Zijian Wang
cs.AI

Abstract

Large language models (LLMs) have demonstrated exceptional capabilities when trained within executable runtime environments, notably excelling at software engineering tasks through verified feedback loops. Yet, scalable and generalizable execution-grounded environments remain scarce, limiting progress in training more capable ML agents. We introduce CTF-Dojo, the first large-scale executable runtime tailored for training LLMs with verifiable feedback, featuring 658 fully functional Capture-The-Flag (CTF)-style challenges containerized in Docker with guaranteed reproducibility. To enable rapid scaling without manual intervention, we develop CTF-Forge, an automated pipeline that transforms publicly available artifacts into ready-to-use execution environments in minutes, eliminating weeks of expert configuration traditionally required. We trained LLM-based agents on just 486 high-quality, execution-verified trajectories from CTF-Dojo, achieving up to 11.6% absolute gains over strong baselines across three competitive benchmarks: InterCode-CTF, NYU CTF Bench, and Cybench. Our best-performing 32B model reaches 31.9% Pass@1, establishing a new open-weight state-of-the-art that rivals frontier models like DeepSeek-V3-0324 and Gemini-2.5-Flash. By framing CTF-style tasks as a benchmark for executable-agent learning, CTF-Dojo demonstrates that execution-grounded training signals are not only effective but pivotal in advancing high-performance ML agents without dependence on costly proprietary systems.
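To make the "execution-grounded, verifiable feedback" idea concrete, below is a minimal illustrative sketch of the kind of harness such training implies: a containerized CTF-style challenge is launched and an agent's submitted flag is checked for a binary reward. The image name, port, flag format, and helper functions are hypothetical illustrations, not CTF-Dojo's or CTF-Forge's actual interface.

```python
# Illustrative sketch only: execution-verified reward for a CTF-style task.
# Image names, ports, flags, and helpers below are assumptions for illustration.
import subprocess


def start_challenge(image: str, name: str, host_port: int) -> None:
    """Launch a containerized CTF-style challenge via the Docker CLI."""
    subprocess.run(
        ["docker", "run", "-d", "--rm", "--name", name,
         "-p", f"{host_port}:1337", image],
        check=True,
    )


def stop_challenge(name: str) -> None:
    """Tear down the challenge container after the episode."""
    subprocess.run(["docker", "stop", name], check=True)


def verify_flag(submitted: str, expected: str) -> float:
    """Binary, execution-verified reward: 1.0 iff the agent recovered the flag."""
    return 1.0 if submitted.strip() == expected else 0.0


if __name__ == "__main__":
    # Hypothetical challenge image and flag, for illustration only.
    start_challenge("ctf-dojo/example-pwn:latest", "example-pwn", 31337)
    try:
        agent_submission = "flag{example}"  # what the LLM agent would return
        print("reward =", verify_flag(agent_submission, "flag{example}"))
    finally:
        stop_challenge("example-pwn")
```

Under this framing, only trajectories whose final flag check succeeds count as execution-verified, which is consistent with the abstract's filtering of 486 high-quality trajectories.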