Training Language Model Agents to Find Vulnerabilities with CTF-Dojo
August 25, 2025
Authors: Terry Yue Zhuo, Dingmin Wang, Hantian Ding, Varun Kumar, Zijian Wang
cs.AI
Abstract
Large language models (LLMs) have demonstrated exceptional capabilities when
trained within executable runtime environments, notably excelling at software
engineering tasks through verified feedback loops. Yet, scalable and
generalizable execution-grounded environments remain scarce, limiting progress
in training more capable ML agents. We introduce CTF-Dojo, the first
large-scale executable runtime tailored for training LLMs with verifiable
feedback, featuring 658 fully functional Capture-The-Flag (CTF)-style
challenges containerized in Docker with guaranteed reproducibility. To enable
rapid scaling without manual intervention, we develop CTF-Forge, an automated
pipeline that transforms publicly available artifacts into ready-to-use
execution environments in minutes, eliminating weeks of expert configuration
traditionally required. We train LLM-based agents on just 486 high-quality,
execution-verified trajectories from CTF-Dojo, achieving up to 11.6% absolute
gains over strong baselines across three competitive benchmarks: InterCode-CTF,
NYU CTF Bench, and Cybench. Our best-performing 32B model reaches 31.9% Pass@1,
establishing a new open-weight state-of-the-art that rivals frontier models
like DeepSeek-V3-0324 and Gemini-2.5-Flash. By framing CTF-style tasks as a
benchmark for executable-agent learning, CTF-Dojo demonstrates that
execution-grounded training signals are not only effective but pivotal in
advancing high-performance ML agents without dependence on costly proprietary
systems.
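The "execution-verified trajectories" above can be illustrated with a minimal sketch. This is a hypothetical illustration, not the paper's actual code: the `Trajectory` class, `filter_verified` function, and the example flag values are all invented names. The idea is simply that a trajectory counts as training data only if its submitted flag matches the challenge's ground-truth flag, giving an execution-grounded success signal rather than a model-judged one.

```python
# Hypothetical sketch of execution-verified trajectory filtering
# (illustrative only; names and data are invented, not from CTF-Dojo).
from dataclasses import dataclass


@dataclass
class Trajectory:
    challenge_id: str
    submitted_flag: str


def filter_verified(trajectories, ground_truth):
    """Keep only trajectories whose submitted flag matches the
    challenge's known flag -- a verifiable, execution-grounded signal."""
    return [
        t for t in trajectories
        if ground_truth.get(t.challenge_id) == t.submitted_flag
    ]


# Example: two runs on the same challenge; only the correct flag survives.
flags = {"web-101": "ctf{example}"}
runs = [
    Trajectory("web-101", "ctf{example}"),
    Trajectory("web-101", "ctf{wrong}"),
]
kept = filter_verified(runs, flags)
print(len(kept))  # → 1
```

In practice the check would run inside the Docker-containerized challenge environment rather than against a static flag table, but the filtering logic is the same: only verified successes become training data.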