LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models

Abstract

With the rapid development of AI-generated content, the future internet may be inundated with synthetic data, making it increasingly challenging to distinguish authentic and credible multimodal data. Synthetic data detection has thus garnered widespread attention, and the performance of large multimodal models (LMMs) on this task has attracted significant interest. LMMs can provide natural language explanations for their authenticity judgments, enhancing the explainability of synthetic content detection. At the same time, the task of distinguishing between real and synthetic data effectively tests the perception, knowledge, and reasoning capabilities of LMMs. In response, we introduce LOKI, a novel benchmark designed to evaluate the ability of LMMs to detect synthetic data across multiple modalities. LOKI encompasses video, image, 3D, text, and audio modalities, comprising 18K carefully curated questions across 26 subcategories with clear difficulty levels. The benchmark includes coarse-grained judgment and multiple-choice questions, as well as fine-grained anomaly selection and explanation tasks, allowing for a comprehensive analysis of LMMs. We evaluated 22 open-source LMMs and 6 closed-source models on LOKI, highlighting their potential as synthetic data detectors and also revealing some limitations in the development of LMM capabilities. More information about LOKI can be found at https://opendatalab.github.io/LOKI/.

