DeMamba: AI-Generated Video Detection on Million-Scale GenVideo Benchmark
May 30, 2024
Authors: Haoxing Chen, Yan Hong, Zizheng Huang, Zhuoer Xu, Zhangxuan Gu, Yaohui Li, Jun Lan, Huijia Zhu, Jianfu Zhang, Weiqiang Wang, Huaxiong Li
cs.AI
Abstract
Recently, video generation techniques have advanced rapidly. Given the
popularity of video content on social media platforms, these models intensify
concerns about the spread of fake information. Therefore, there is a growing
demand for detectors capable of identifying fake AI-generated videos and
thereby mitigating the potential harm caused by fake information. However, the lack
of large-scale datasets from the most advanced video generators poses a barrier
to the development of such detectors. To address this gap, we introduce the
first AI-generated video detection dataset, GenVideo. It features the following
characteristics: (1) a large volume of videos, comprising over one million
AI-generated and real videos; (2) a rich diversity of generated
content and methodologies, covering a broad spectrum of video categories and
generation techniques. We conducted extensive studies of the dataset and
proposed two evaluation methods tailored to realistic scenarios to
assess detector performance: the cross-generator video classification
task assesses how well trained detectors generalize across generators; the
degraded video classification task evaluates how robustly detectors
handle videos whose quality has degraded during dissemination. Moreover, we
introduced a plug-and-play module, named Detail Mamba (DeMamba), designed to
enhance the detectors by identifying AI-generated videos through the analysis
of inconsistencies in temporal and spatial dimensions. Our extensive
experiments demonstrate DeMamba's superior generalizability and robustness on
GenVideo compared to existing detectors. We believe that the GenVideo dataset
and the DeMamba module will significantly advance the field of AI-generated
video detection. Our code and dataset will be available at
https://github.com/chenhaoxing/DeMamba.
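
The cross-generator evaluation described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the sample records, the generator names (`gen_a`, `gen_b`, `gen_c`), and the helpers `cross_generator_split` and `per_generator_recall` are all hypothetical, and the actual GenVideo protocol may differ. The sketch assumes a detector is trained on videos from some generators and then scored separately on each held-out generator.

```python
from collections import defaultdict

# Hypothetical toy records: (video_id, source, label).
# label 1 = AI-generated, label 0 = real. Names are placeholders only.
SAMPLES = [
    ("v1", "real", 0),
    ("v2", "gen_a", 1),
    ("v3", "gen_a", 1),
    ("v4", "gen_b", 1),
    ("v5", "real", 0),
    ("v6", "gen_c", 1),
]


def cross_generator_split(samples, train_generators):
    """Partition AI-generated samples by generator.

    Samples from generators in `train_generators` go to the training set;
    samples from every other generator are held out for testing.
    Real videos are ignored here and left for the caller to split.
    """
    train = [s for s in samples if s[2] == 1 and s[1] in train_generators]
    held_out = [s for s in samples if s[2] == 1 and s[1] not in train_generators]
    return train, held_out


def per_generator_recall(samples, predict):
    """Fraction of AI-generated videos flagged as fake, per generator.

    `predict` is any callable mapping a video_id to 0 (real) or 1 (fake).
    """
    hits, totals = defaultdict(int), defaultdict(int)
    for video_id, source, label in samples:
        if label == 1:
            totals[source] += 1
            hits[source] += int(predict(video_id) == 1)
    return {source: hits[source] / totals[source] for source in totals}


if __name__ == "__main__":
    train, held_out = cross_generator_split(SAMPLES, {"gen_a"})
    # Stand-in detector that only recognizes video "v4" as fake.
    toy_detector = lambda video_id: 1 if video_id == "v4" else 0
    print(per_generator_recall(held_out, toy_detector))
```

Reporting recall per held-out generator, rather than one pooled number, shows which generators a detector fails to generalize to.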