Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance
March 21, 2024
Authors: Shenhao Zhu, Junming Leo Chen, Zuozhuo Dai, Yinghui Xu, Xun Cao, Yao Yao, Hao Zhu, Siyu Zhu
cs.AI
Abstract
In this study, we introduce a methodology for human image animation by
leveraging a 3D human parametric model within a latent diffusion framework to
enhance shape alignment and motion guidance in current human generative
techniques. The methodology utilizes the SMPL (Skinned Multi-Person Linear)
model as the 3D human parametric model to establish a unified representation of
body shape and pose. This facilitates the accurate capture of intricate human
geometry and motion characteristics from source videos. Specifically, we
incorporate rendered depth images, normal maps, and semantic maps obtained from
SMPL sequences, alongside skeleton-based motion guidance, to enrich the
conditions to the latent diffusion model with comprehensive 3D shape and
detailed pose attributes. A multi-layer motion fusion module, integrating
self-attention mechanisms, is employed to fuse the shape and motion latent
representations in the spatial domain. By representing the 3D human parametric
model as the motion guidance, we can perform parametric shape alignment of the
human body between the reference image and the source video motion.
Experimental evaluations conducted on benchmark datasets demonstrate the
methodology's superior ability to generate high-quality human animations that
accurately capture both pose and shape variations. Furthermore, our approach
also exhibits superior generalization capabilities on the proposed wild
dataset. Project page: https://fudan-generative-vision.github.io/champ.
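
To make the fusion step in the abstract more concrete, here is a minimal NumPy sketch of spatial self-attention applied to several guidance feature maps. It is an illustration under stated assumptions, not the authors' implementation. It assumes each condition (depth, normal, semantic, skeleton) has already been encoded into a feature map of shape (C, H, W). The learned query/key/value projections are omitted, and summing the attended maps is an assumed fusion rule. The function names `spatial_self_attention` and `fuse_guidance` are hypothetical.

```python
import numpy as np


def spatial_self_attention(feat):
    """Single-head self-attention over the spatial positions of a (C, H, W) map.

    Assumption: learned query/key/value projections are omitted for brevity,
    so each spatial token attends to the others using its raw features.
    """
    c, h, w = feat.shape
    tokens = feat.reshape(c, h * w).T            # (H*W, C): one token per pixel
    scores = tokens @ tokens.T / np.sqrt(c)      # (H*W, H*W) scaled dot products
    scores -= scores.max(axis=1, keepdims=True)  # subtract row max for stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)  # softmax over key positions
    attended = weights @ tokens                  # (H*W, C)
    return attended.T.reshape(c, h, w)


def fuse_guidance(guidance):
    """Attend within each guidance map, then sum the results.

    Assumption: element-wise summation is used as the fusion rule here.
    """
    attended = [spatial_self_attention(g) for g in guidance.values()]
    return np.sum(attended, axis=0)


# Random stand-ins for encoded guidance maps (hypothetical shapes: C=8, H=W=4).
rng = np.random.default_rng(0)
guidance = {
    name: rng.standard_normal((8, 4, 4))
    for name in ("depth", "normal", "semantic", "skeleton")
}
fused = fuse_guidance(guidance)
print(fused.shape)  # (8, 4, 4)
```

The fused map keeps the shape of a single guidance map, so under these assumptions it could be passed to a denoising network as one conditioning signal.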