Cross-Domain Evaluation of Transformer-Based Vulnerability Detection on Open & Industry Data

September 11, 2025
Authors: Moritz Mock, Thomas Forrer, Barbara Russo
cs.AI

Abstract

Deep learning solutions for vulnerability detection proposed in academic research are not always accessible to developers, and their applicability in industrial settings is rarely addressed. Transferring such technologies from academia to industry presents challenges related to trustworthiness, legacy systems, limited digital literacy, and the gap between academic and industrial expertise. For deep learning in particular, performance and integration into existing workflows are additional concerns. In this work, we first evaluate the performance of CodeBERT for detecting vulnerable functions in industrial and open-source software. We analyse its cross-domain generalisation when fine-tuned on open-source data and tested on industrial data, and vice versa, also exploring strategies for handling class imbalance. Based on these results, we develop AI-DO (Automating vulnerability detection Integration for Developers' Operations), a Continuous Integration-Continuous Deployment (CI/CD)-integrated recommender system that uses fine-tuned CodeBERT to detect and localise vulnerabilities during code review without disrupting workflows. Finally, we assess the tool's perceived usefulness through a survey with the company's IT professionals. Our results show that models trained on industrial data detect vulnerabilities accurately within the same domain but lose performance on open-source code, while a deep learner fine-tuned on open data, with appropriate undersampling techniques, improves the detection of vulnerabilities.
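
The abstract mentions undersampling as a way to handle class imbalance but does not say which technique was used. As a rough illustration only, the sketch below shows plain random undersampling of the majority class before fine-tuning. The function name `undersample`, the `ratio` parameter, and the label values are assumptions made for this example and are not taken from the paper.

```python
import random
from collections import Counter


def undersample(samples, labels, ratio=1.0, seed=0):
    """Randomly drop items from larger classes (illustrative, not the paper's method).

    Each non-minority class is cut to at most ratio * minority_count items.
    The original ordering of the kept items is preserved.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    minority_label = min(counts, key=counts.get)
    minority_count = counts[minority_label]

    keep = []
    for label in counts:
        idx = [i for i, y in enumerate(labels) if y == label]
        if label != minority_label:
            # Cap the larger class relative to the minority class size.
            limit = min(len(idx), int(ratio * minority_count))
            idx = rng.sample(idx, limit)
        keep.extend(idx)

    keep.sort()
    return [samples[i] for i in keep], [labels[i] for i in keep]


# Hypothetical toy data: 2 vulnerable functions (1) among 8 non-vulnerable ones (0).
functions = [f"func_{i}" for i in range(10)]
is_vulnerable = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]

balanced_functions, balanced_labels = undersample(functions, is_vulnerable)
print(Counter(balanced_labels))
```

With `ratio=1.0` the non-vulnerable class is reduced to the size of the vulnerable class; a larger `ratio` keeps more majority samples. Resampling of this kind is normally applied to the training split only, so that test data keeps its original class distribution.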