
DisCo: Disentangled Control for Referring Human Dance Generation in Real World

June 30, 2023
Authors: Tan Wang, Linjie Li, Kevin Lin, Chung-Ching Lin, Zhengyuan Yang, Hanwang Zhang, Zicheng Liu, Lijuan Wang
cs.AI

Abstract

Generative AI has made significant strides in computer vision, particularly in image and video synthesis conditioned on text descriptions. Despite these advances, generating human-centric content such as dance remains challenging. Existing dance synthesis methods struggle to bridge the gap between synthesized content and real-world dance scenarios. In this paper, we define a new problem setting, Referring Human Dance Generation, which focuses on real-world dance scenarios with three important properties: (i) Faithfulness: the synthesis should retain the appearance of both the human subject (foreground) and the background from the reference image, and precisely follow the target pose; (ii) Generalizability: the model should generalize to unseen human subjects, backgrounds, and poses; (iii) Compositionality: it should allow seen and unseen subjects, backgrounds, and poses from different sources to be composed. To address these challenges, we introduce DisCo, a novel approach that combines a model architecture with disentangled control, which improves the faithfulness and compositionality of dance synthesis, with an effective human attribute pre-training for better generalizability to unseen humans. Extensive qualitative and quantitative results demonstrate that DisCo can generate high-quality human dance images and videos with diverse appearances and flexible motions. Code, demo, video, and visualizations are available at: https://disco-dance.github.io/.