DisCo: Disentangled Control for Referring Human Dance Generation in Real World

June 30, 2023
Authors: Tan Wang, Linjie Li, Kevin Lin, Chung-Ching Lin, Zhengyuan Yang, Hanwang Zhang, Zicheng Liu, Lijuan Wang
cs.AI

Abstract

Generative AI has made significant strides in computer vision, particularly in image/video synthesis conditioned on text descriptions. Despite these advances, generating human-centric content such as dance remains challenging. Existing dance synthesis methods struggle to bridge the gap between synthesized content and real-world dance scenarios. In this paper, we define a new problem setting, Referring Human Dance Generation, which focuses on real-world dance scenarios with three important properties: (i) Faithfulness: the synthesis should retain the appearance of both the human subject foreground and the background from the reference image, and precisely follow the target pose; (ii) Generalizability: the model should generalize to unseen human subjects, backgrounds, and poses; (iii) Compositionality: it should allow for composition of seen/unseen subjects, backgrounds, and poses from different sources. To address these challenges, we introduce a novel approach, DISCO, which combines a model architecture with disentangled control, to improve the faithfulness and compositionality of dance synthesis, with effective human attribute pre-training, for better generalizability to unseen humans. Extensive qualitative and quantitative results demonstrate that DISCO can generate high-quality human dance images and videos with diverse appearances and flexible motions. Code, demo, video, and visualization are available at: https://disco-dance.github.io/.