The Hidden DNA of LLM-Generated JavaScript: Structural Patterns Enable High-Accuracy Authorship Attribution

Abstract

In this paper, we present the first large-scale study of whether JavaScript code generated by large language models (LLMs) reveals which model produced it, enabling reliable authorship attribution and model fingerprinting. With the rapid rise of AI-generated code, attribution plays a critical role in detecting vulnerabilities, flagging malicious content, and ensuring accountability. While AI-vs-human detection usually treats AI as a single category, we show that individual LLMs leave unique stylistic signatures, even among models of the same family or parameter size. To this end, we introduce LLM-NodeJS, a dataset of 50,000 Node.js back-end programs generated by 20 large language models. Each program has four transformed variants, so the original plus its variants yield 250,000 unique JavaScript samples, along with two additional representations (JSIR and AST) for diverse research applications. Using this dataset, we benchmark traditional machine learning classifiers against fine-tuned Transformer encoders and introduce CodeT5-JSA, a custom architecture derived from the 770M-parameter CodeT5 model with its decoder removed and a modified classification head. It achieves 95.8% accuracy on five-class attribution, 94.6% on ten-class, and 88.5% on twenty-class tasks, surpassing other tested models such as BERT, CodeBERT, and Longformer. We show that the classifiers capture deeper stylistic regularities in program dataflow and structure rather than relying on surface-level features. As a result, attribution remains effective even after mangling, comment removal, and heavy code transformations. To support open science and reproducibility, we release the LLM-NodeJS dataset, Google Colab training scripts, and all related materials on GitHub: https://github.com/LLM-NodeJS-dataset.
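To put the reported figures in context, the short sketch below recomputes the dataset size and compares each reported accuracy with what uniform random guessing would give for the same number of classes. It is only an illustration built from the numbers quoted in the abstract. The function names are hypothetical, and the chance baseline is our own point of comparison, not one the abstract reports.

```python
# Figures quoted in the abstract.
NUM_PROGRAMS = 50_000
VARIANTS_PER_PROGRAM = 4
REPORTED_ACCURACY = {5: 0.958, 10: 0.946, 20: 0.885}  # num_classes -> accuracy


def total_samples(programs: int, variants: int) -> int:
    """Count each original program plus its transformed variants."""
    return programs * (1 + variants)


def chance_accuracy(num_classes: int) -> float:
    """Expected accuracy of guessing a class uniformly at random.

    Assumption: this baseline is introduced here for illustration only.
    """
    return 1.0 / num_classes


def lift_over_chance(accuracy: float, num_classes: int) -> float:
    """How many times better than uniform guessing a given accuracy is."""
    return accuracy / chance_accuracy(num_classes)


if __name__ == "__main__":
    print("samples:", total_samples(NUM_PROGRAMS, VARIANTS_PER_PROGRAM))
    for k, acc in REPORTED_ACCURACY.items():
        print(f"{k}-class: chance={chance_accuracy(k):.2f}, "
              f"reported={acc:.3f}, lift={lift_over_chance(acc, k):.1f}x")
```

Read this way, accuracy falls only modestly as the number of candidate models grows from 5 to 20, while the margin over random guessing widens.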
