Spot the Fake: Large Multimodal Model-Based Synthetic Image Detection with Artifact Explanation

Abstract

With the rapid advancement of Artificial Intelligence Generated Content (AIGC) technologies, synthetic images have become increasingly prevalent in everyday life, posing new challenges for authenticity assessment and detection. Although existing methods are effective at assessing image authenticity and localizing forgeries, they often lack human interpretability and do not fully address the growing complexity of synthetic data. To tackle these challenges, we introduce FakeVLM, a specialized large multimodal model designed for both general synthetic image detection and DeepFake detection. FakeVLM not only distinguishes real from fake images but also provides clear natural-language explanations of image artifacts, improving interpretability. We also present FakeClue, a comprehensive dataset of over 100,000 images across seven categories, annotated with fine-grained artifact clues in natural language. FakeVLM achieves performance comparable to expert models without requiring additional classifiers, making it a robust solution for synthetic data detection. Extensive evaluations across multiple datasets confirm the superiority of FakeVLM in both authenticity classification and artifact explanation, setting a new benchmark for synthetic image detection. The dataset and code will be released at https://github.com/opendatalab/FakeVLM.