
I Know Which LLM Wrote Your Code Last Summer: LLM-Generated Code Stylometry for Authorship Attribution

June 18, 2025
作者: Tamas Bisztray, Bilel Cherif, Richard A. Dubniczky, Nils Gruschka, Bertalan Borsos, Mohamed Amine Ferrag, Attila Kovacs, Vasileios Mavroeidis, Norbert Tihanyi
cs.AI

Abstract
Detecting AI-generated code, deepfakes, and other synthetic content is an emerging research challenge. As code generated by Large Language Models (LLMs) becomes more common, identifying the specific model behind each sample is increasingly important. This paper presents the first systematic study of LLM authorship attribution for C programs. We release CodeT5-Authorship, a novel model that uses only the encoder layers from the original CodeT5 encoder-decoder architecture, discarding the decoder to focus on classification. Our model's encoder output (first token) is passed through a two-layer classification head with GELU activation and dropout, producing a probability distribution over possible authors. To evaluate our approach, we introduce LLM-AuthorBench, a benchmark of 32,000 compilable C programs generated by eight state-of-the-art LLMs across diverse tasks. We compare our model to seven traditional machine-learning classifiers and eight fine-tuned Transformer models, including BERT, RoBERTa, CodeBERT, ModernBERT, DistilBERT, DeBERTa-V3, Longformer, and LoRA-fine-tuned Qwen2-1.5B. In binary classification, our model achieves 97.56% accuracy in distinguishing C programs generated by closely related models such as GPT-4.1 and GPT-4o, and 95.40% accuracy for multi-class attribution among five leading LLMs (Gemini 2.5 Flash, Claude 3.5 Haiku, GPT-4.1, Llama 3.3, and DeepSeek-V3). To support open science, we release the CodeT5-Authorship architecture, the LLM-AuthorBench benchmark, and all relevant Google Colab scripts on GitHub: https://github.com/LLMauthorbench/.
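The classification head described above can be sketched as follows. This is a minimal PyTorch illustration, not the authors' released implementation: the hidden size of 768 (matching CodeT5-base), the dropout rate, and the `AuthorshipHead` name are assumptions; only the first-token pooling, the two linear layers, the GELU activation, and the dropout come from the abstract.

```python
import torch
import torch.nn as nn


class AuthorshipHead(nn.Module):
    """Two-layer classification head with GELU activation and dropout,
    applied to the encoder output at the first token position.
    Hidden size 768 is an assumption (it matches CodeT5-base)."""

    def __init__(self, hidden_size: int = 768, num_authors: int = 5,
                 dropout: float = 0.1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(hidden_size, hidden_size),
            nn.GELU(),
            nn.Dropout(dropout),
            nn.Linear(hidden_size, num_authors),
        )

    def forward(self, encoder_hidden_states: torch.Tensor) -> torch.Tensor:
        # Pool the encoder output at the first token position only.
        first_token = encoder_hidden_states[:, 0, :]
        # Return logits over the candidate LLM authors.
        return self.net(first_token)


# Stand-in for CodeT5 encoder output: batch of 2, sequence length 16.
head = AuthorshipHead().eval()
with torch.no_grad():
    logits = head(torch.randn(2, 16, 768))
# Softmax yields the probability distribution over possible authors.
probs = torch.softmax(logits, dim=-1)
```

In a full pipeline, `encoder_hidden_states` would come from the encoder of a pretrained CodeT5 checkpoint (e.g. via Hugging Face's `T5EncoderModel`) run over tokenized C source, with the head trained jointly on LLM-AuthorBench labels.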
June 24, 2025