Transformer-based Vulnerability Detection in Code at EditTime: Zero-shot, Few-shot, or Fine-tuning?

Abstract

Software vulnerabilities impose significant costs on enterprises. Despite extensive research and development of software vulnerability detection methods, uncaught vulnerabilities continue to put software owners and users at risk. Many current vulnerability detection methods require that code snippets compile and build before detection can be attempted. This, unfortunately, introduces a long latency between the time a vulnerability is injected and the time it is removed, which can substantially increase the cost of fixing it. We recognize that current advances in machine learning can be used to detect vulnerable code patterns in syntactically incomplete code snippets as the developer is writing the code, that is, at EditTime. In this paper we present a practical system that leverages deep learning on a large-scale dataset of vulnerable code patterns to learn complex manifestations of more than 250 vulnerability types and to detect vulnerable code patterns at EditTime. We discuss zero-shot, few-shot, and fine-tuning approaches on state-of-the-art pre-trained Large Language Models (LLMs). We show that our approach improves on state-of-the-art vulnerability detection models by 10%. We also evaluate our approach for detecting vulnerabilities in code auto-generated by code LLMs. Evaluation on a benchmark of high-risk code scenarios shows a vulnerability reduction of up to 90%.
