Transformer-based Vulnerability Detection in Code at EditTime: Zero-shot, Few-shot, or Fine-tuning?
May 23, 2023
作者: Aaron Chan, Anant Kharkar, Roshanak Zilouchian Moghaddam, Yevhen Mohylevskyy, Alec Helyar, Eslam Kamal, Mohamed Elkamhawy, Neel Sundaresan
cs.AI
Abstract
Software vulnerabilities impose significant costs on enterprises. Despite
extensive research and development of software vulnerability detection
methods, uncaught vulnerabilities continue to put software owners and users
at risk. Many current vulnerability detection methods require that code
snippets compile and build before detection is attempted. This,
unfortunately, introduces a long latency between the time a vulnerability is
injected and the time it is removed, which can substantially increase the
cost of fixing it. We recognize that current advances in machine learning can
be used to detect vulnerable code patterns in syntactically incomplete code
snippets as the developer is writing the code, at EditTime. In this paper we
present a practical system that applies deep learning to a large-scale
dataset of vulnerable code patterns to learn complex manifestations of more
than 250 vulnerability types and detect vulnerable code patterns at
EditTime. We discuss zero-shot, few-shot, and fine-tuning approaches on
state-of-the-art pre-trained Large Language Models (LLMs). We show that our
approach improves on state-of-the-art vulnerability detection models by 10%.
We also evaluate our approach for detecting vulnerabilities in code
auto-generated by code LLMs. Evaluation on a benchmark of high-risk code
scenarios shows a reduction in vulnerabilities of up to 90%.
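The zero-shot setting described above can be sketched as follows. This is a minimal, hypothetical illustration, not the paper's actual system: `query_llm` stands in for any LLM completion API, and the stub used in the usage example is a toy heuristic, not a real model.

```python
# Hypothetical sketch of zero-shot vulnerability classification at EditTime.
# The snippet may be syntactically incomplete because the developer is
# still typing; the prompt tells the model to tolerate that.

def build_zero_shot_prompt(snippet: str) -> str:
    """Wrap a (possibly incomplete) code snippet in a zero-shot
    classification prompt."""
    return (
        "You are a security reviewer. The following code may be "
        "incomplete because the developer is still typing.\n"
        "Answer 'vulnerable' or 'safe' only.\n\n"
        f"Code:\n{snippet}\n\nAnswer:"
    )


def is_vulnerable(snippet: str, query_llm) -> bool:
    """Classify a snippet. `query_llm` maps a prompt string to the
    model's completion string (any LLM backend could be plugged in)."""
    answer = query_llm(build_zero_shot_prompt(snippet)).strip().lower()
    return answer.startswith("vulnerable")


# Usage with a toy stand-in "model" that flags SQL built by string
# concatenation (a common injection pattern):
fake_llm = lambda p: "vulnerable" if "execute(" in p and "+" in p else "safe"

incomplete = 'cursor.execute("SELECT * FROM users WHERE id=" + user_id'
print(is_vulnerable(incomplete, fake_llm))  # True with this stub
```

A few-shot variant would differ only in the prompt builder, which would prepend a handful of labeled (snippet, verdict) examples before the query snippet.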