
I Know Which LLM Wrote Your Code Last Summer: LLM generated Code Stylometry for Authorship Attribution

June 18, 2025
Authors: Tamas Bisztray, Bilel Cherif, Richard A. Dubniczky, Nils Gruschka, Bertalan Borsos, Mohamed Amine Ferrag, Attila Kovacs, Vasileios Mavroeidis, Norbert Tihanyi
cs.AI

Abstract

Detecting AI-generated code, deepfakes, and other synthetic content is an emerging research challenge. As code generated by Large Language Models (LLMs) becomes more common, identifying the specific model behind each sample is increasingly important. This paper presents the first systematic study of LLM authorship attribution for C programs. We release CodeT5-Authorship, a novel model that uses only the encoder layers from the original CodeT5 encoder-decoder architecture, discarding the decoder to focus on classification. Our model's encoder output (first token) is passed through a two-layer classification head with GELU activation and dropout, producing a probability distribution over possible authors. To evaluate our approach, we introduce LLM-AuthorBench, a benchmark of 32,000 compilable C programs generated by eight state-of-the-art LLMs across diverse tasks. We compare our model to seven traditional ML classifiers and eight fine-tuned transformer models, including BERT, RoBERTa, CodeBERT, ModernBERT, DistilBERT, DeBERTa-V3, Longformer, and LoRA-fine-tuned Qwen2-1.5B. In binary classification, our model achieves 97.56% accuracy in distinguishing C programs generated by closely related models such as GPT-4.1 and GPT-4o, and 95.40% accuracy for multi-class attribution among five leading LLMs (Gemini 2.5 Flash, Claude 3.5 Haiku, GPT-4.1, Llama 3.3, and DeepSeek-V3). To support open science, we release the CodeT5-Authorship architecture, the LLM-AuthorBench benchmark, and all relevant Google Colab scripts on GitHub: https://github.com/LLMauthorbench/.
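The classifier described in the abstract (CodeT5's encoder only, with the first-token representation fed to a two-layer head using GELU and dropout) might look roughly like the sketch below. This is not the authors' released implementation; the checkpoint name (Salesforce/codet5-base), hidden size, dropout rate, tokenizer choice, and class names are assumptions for illustration only. The actual code is available at the GitHub link above.

```python
# Minimal sketch of an encoder-only CodeT5 authorship classifier,
# assuming PyTorch and Hugging Face Transformers (not the paper's released code).
import torch
import torch.nn as nn
from transformers import T5EncoderModel, RobertaTokenizer

class CodeT5AuthorshipSketch(nn.Module):
    def __init__(self, num_authors: int, hidden: int = 768, dropout: float = 0.1):
        super().__init__()
        # Keep only CodeT5's encoder stack; the decoder is discarded entirely.
        self.encoder = T5EncoderModel.from_pretrained("Salesforce/codet5-base")
        # Two-layer classification head with GELU activation and dropout.
        self.head = nn.Sequential(
            nn.Linear(hidden, hidden),
            nn.GELU(),
            nn.Dropout(dropout),
            nn.Linear(hidden, num_authors),
        )

    def forward(self, input_ids, attention_mask):
        enc = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        first_token = enc.last_hidden_state[:, 0, :]  # first-token encoder output
        return self.head(first_token)                 # logits over candidate LLM authors

# Illustrative usage: score a C snippet against five hypothetical candidate LLMs.
tokenizer = RobertaTokenizer.from_pretrained("Salesforce/codet5-base")
model = CodeT5AuthorshipSketch(num_authors=5)
batch = tokenizer(["int main(void) { return 0; }"], return_tensors="pt",
                  truncation=True, padding=True)
logits = model(batch["input_ids"], batch["attention_mask"])
probs = torch.softmax(logits, dim=-1)  # probability distribution over possible authors
```

Applying softmax to the logits yields the probability distribution over candidate authors that the abstract refers to; dropping the decoder keeps only the parameters needed for classification.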