DeepSpeak Dataset v1.0

Abstract

We describe a large-scale dataset, {\em DeepSpeak}, of real and deepfake footage of people talking and gesturing in front of their webcams. The real videos in this first version of the dataset comprise 9 hours of footage from 220 diverse individuals. The fake videos, totaling more than 25 hours of footage, span a range of state-of-the-art face-swap and lip-sync deepfakes with natural and AI-generated voices. We expect to release future versions of this dataset with different and updated deepfake technologies. This dataset is made freely available for research and non-commercial uses; requests for commercial use will be considered.
