The Hidden DNA of LLM-Generated JavaScript: Structural Patterns Enable High-Accuracy Authorship Attribution
October 12, 2025
Authors: Norbert Tihanyi, Bilel Cherif, Richard A. Dubniczky, Mohamed Amine Ferrag, Tamás Bisztray
cs.AI
Abstract
In this paper, we present the first large-scale study exploring whether
JavaScript code generated by Large Language Models (LLMs) can reveal which
model produced it, enabling reliable authorship attribution and model
fingerprinting. With the rapid rise of AI-generated code, attribution plays a
critical role in detecting vulnerabilities, flagging malicious content, and
ensuring accountability. While AI-vs-human detection usually treats AI as a
single category, we show that individual LLMs leave unique stylistic
signatures, even among models of the same family or
parameter size. To this end, we introduce LLM-NodeJS, a dataset of 50,000
Node.js back-end programs from 20 large language models. Each program has four
transformed variants, yielding 250,000 unique JavaScript samples, along with two
additional representations (JSIR and AST) for diverse research applications.
Using this dataset, we benchmark traditional machine learning classifiers
against fine-tuned Transformer encoders and introduce CodeT5-JSA, a custom
architecture derived from the 770M-parameter CodeT5 model with its decoder
removed and a modified classification head. It achieves 95.8% accuracy on
five-class attribution, 94.6% on ten-class, and 88.5% on twenty-class tasks,
surpassing other tested models such as BERT, CodeBERT, and Longformer. We
demonstrate that classifiers capture deeper stylistic regularities in program
dataflow and structure, rather than relying on surface-level features. As a
result, attribution remains effective even after mangling, comment removal, and
heavy code transformations. To support open science and reproducibility, we
release the LLM-NodeJS dataset, Google Colab training scripts, and all related
materials on GitHub: https://github.com/LLM-NodeJS-dataset.
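
As a rough, hypothetical illustration of the encoder-only setup the abstract describes, the sketch below builds a CodeT5 encoder with a lightweight classification head in PyTorch. It assumes the HuggingFace Salesforce/codet5-large checkpoint (the 770M-parameter CodeT5) and a mean-pooled linear head; the actual CodeT5-JSA classification head and training procedure are those in the released Colab scripts and may differ.

```python
# Minimal sketch (not the authors' exact implementation): an encoder-only
# CodeT5 classifier in the spirit of CodeT5-JSA. Assumes the HuggingFace
# "Salesforce/codet5-large" checkpoint (~770M params) and a mean-pooled
# linear classification head; the paper's actual head may differ.
import torch
import torch.nn as nn
from transformers import AutoTokenizer, T5EncoderModel

class CodeT5Attribution(nn.Module):
    def __init__(self, num_labels: int = 20, checkpoint: str = "Salesforce/codet5-large"):
        super().__init__()
        # T5EncoderModel loads only the encoder stack, i.e. the decoder is dropped.
        self.encoder = T5EncoderModel.from_pretrained(checkpoint)
        hidden = self.encoder.config.d_model
        self.head = nn.Sequential(nn.Dropout(0.1), nn.Linear(hidden, num_labels))

    def forward(self, input_ids, attention_mask):
        states = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        mask = attention_mask.unsqueeze(-1).float()
        pooled = (states * mask).sum(dim=1) / mask.sum(dim=1)  # masked mean pooling
        return self.head(pooled)  # logits over the candidate source models

tokenizer = AutoTokenizer.from_pretrained("Salesforce/codet5-large")
model = CodeT5Attribution(num_labels=20)
batch = tokenizer(["const express = require('express');\nconst app = express();"],
                  return_tensors="pt", padding=True, truncation=True, max_length=512)
with torch.no_grad():
    logits = model(batch["input_ids"], batch["attention_mask"])
pred = logits.argmax(dim=-1)  # index of the predicted source LLM
```

In this sketch, num_labels would be set to 5, 10, or 20 to match the five-, ten-, and twenty-class attribution tasks reported above; end-to-end fine-tuning on the LLM-NodeJS samples would then follow the released Google Colab training scripts.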