DeMamba: AI-Generated Video Detection on Million-Scale GenVideo Benchmark
May 30, 2024
Authors: Haoxing Chen, Yan Hong, Zizheng Huang, Zhuoer Xu, Zhangxuan Gu, Yaohui Li, Jun Lan, Huijia Zhu, Jianfu Zhang, Weiqiang Wang, Huaxiong Li
cs.AI
Abstract
Recently, video generation techniques have advanced rapidly. Given the
popularity of video content on social media platforms, these models intensify
concerns about the spread of fake information. Therefore, there is a growing
demand for detectors capable of distinguishing AI-generated videos from real
ones and mitigating the potential harm caused by fake information. However, the lack
of large-scale datasets from the most advanced video generators poses a barrier
to the development of such detectors. To address this gap, we introduce the
first AI-generated video detection dataset, GenVideo. It features the following
characteristics: (1) a large volume of videos, with over one million
AI-generated and real videos; (2) a rich diversity of generated
content and methodologies, covering a broad spectrum of video categories and
generation techniques. We conducted extensive studies of the dataset and
proposed two evaluation methods tailored for real-world-like scenarios to
assess detectors' performance: the cross-generator video classification
task assesses how well trained detectors generalize across generators; the
degraded video classification task evaluates the robustness of detectors to
videos whose quality has degraded during dissemination. Moreover, we
introduced a plug-and-play module, named Detail Mamba (DeMamba), designed to
enhance the detectors by identifying AI-generated videos through the analysis
of inconsistencies in temporal and spatial dimensions. Our extensive
experiments demonstrate DeMamba's superior generalizability and robustness on
GenVideo compared to existing detectors. We believe that the GenVideo dataset
and the DeMamba module will significantly advance the field of AI-generated
video detection. Our code and dataset will be available at
https://github.com/chenhaoxing/DeMamba.