
DisCo: Disentangled Control for Referring Human Dance Generation in Real World

June 30, 2023
Authors: Tan Wang, Linjie Li, Kevin Lin, Chung-Ching Lin, Zhengyuan Yang, Hanwang Zhang, Zicheng Liu, Lijuan Wang
cs.AI

Abstract

Generative AI has made significant strides in computer vision, particularly in image/video synthesis conditioned on text descriptions. Despite these advances, generating human-centric content such as dance remains challenging. Existing dance synthesis methods struggle to close the gap between synthesized content and real-world dance scenarios. In this paper, we define a new problem setting, Referring Human Dance Generation, which focuses on real-world dance scenarios with three important properties: (i) Faithfulness: the synthesis should retain the appearance of both the human subject foreground and the background from the reference image, and precisely follow the target pose; (ii) Generalizability: the model should generalize to unseen human subjects, backgrounds, and poses; (iii) Compositionality: it should allow for composition of seen/unseen subjects, backgrounds, and poses from different sources. To address these challenges, we introduce DISCO, an approach that combines a novel model architecture with disentangled control, which improves the faithfulness and compositionality of dance synthesis, and an effective human attribute pre-training, which improves generalizability to unseen humans. Extensive qualitative and quantitative results demonstrate that DISCO can generate high-quality human dance images and videos with diverse appearances and flexible motions. Code, demo, video, and visualizations are available at: https://disco-dance.github.io/.