DeepSpeak Dataset v1.0

Abstract

We describe a large-scale dataset, {\em DeepSpeak}, of real and deepfake footage of people talking and gesturing in front of their webcams. The real videos in this first version of the dataset consist of 9 hours of footage from 220 diverse individuals. The fake videos, totaling more than 25 hours of footage, comprise a range of state-of-the-art face-swap and lip-sync deepfakes with natural and AI-generated voices. We expect to release future versions of this dataset with different and updated deepfake technologies. This dataset is made freely available for research and non-commercial uses; requests for commercial use will be considered.
