Automatic detection of Gen-AI texts: A comparative framework of neural models

Abstract

The rapid proliferation of Large Language Models has made it significantly harder to distinguish human-written from AI-generated text, raising critical issues across academic, editorial, and social domains. This paper investigates AI-generated text detection through the design, implementation, and comparative evaluation of multiple machine-learning-based detectors. Four neural architectures are developed and analyzed: a Multilayer Perceptron, a one-dimensional Convolutional Neural Network, a MobileNet-based CNN, and a Transformer model. The proposed models are benchmarked against widely used online detectors, including ZeroGPT, GPTZero, QuillBot, Originality.AI, Sapling, IsGen, Rephrase, and Writer. Experiments are conducted on the COLING Multilingual Dataset, in both English and Italian configurations, and on an original thematic dataset focused on Art and Mental Health. Results show that the supervised detectors achieve more stable and robust performance than the commercial tools across languages and domains, highlighting key strengths and limitations of current detection strategies.
