Title: Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance
March 21, 2024
Authors: Shenhao Zhu, Junming Leo Chen, Zuozhuo Dai, Yinghui Xu, Xun Cao, Yao Yao, Hao Zhu, Siyu Zhu
cs.AI
Abstract
In this study, we introduce a methodology for human image animation by
leveraging a 3D human parametric model within a latent diffusion framework to
enhance shape alignment and motion guidance in current human generative
techniques. The methodology utilizes the SMPL (Skinned Multi-Person Linear)
model as the 3D human parametric model to establish a unified representation of
body shape and pose. This facilitates the accurate capture of intricate human
geometry and motion characteristics from source videos. Specifically, we
incorporate rendered depth images, normal maps, and semantic maps obtained from
SMPL sequences, alongside skeleton-based motion guidance, to enrich the
conditions to the latent diffusion model with comprehensive 3D shape and
detailed pose attributes. A multi-layer motion fusion module, integrating
self-attention mechanisms, is employed to fuse the shape and motion latent
representations in the spatial domain. By representing the 3D human parametric
model as the motion guidance, we can perform parametric shape alignment of the
human body between the reference image and the source video motion.
Experimental evaluations conducted on benchmark datasets demonstrate the
methodology's superior ability to generate high-quality human animations that
accurately capture both pose and shape variations. Furthermore, our approach
also exhibits superior generalization capabilities on the proposed wild
dataset. Project page: https://fudan-generative-vision.github.io/champ.
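
The parametric shape alignment mentioned in the abstract can be illustrated with a minimal sketch. It assumes that alignment means keeping the body-shape coefficients estimated from the reference image while taking the per-frame pose from the source video; the abstract does not spell out the exact procedure. The names `SmplParams` and `align_shape` are hypothetical and are not taken from the Champ codebase. The sizes used (10 shape coefficients, 72 pose values for 24 joints in axis-angle form) are the common SMPL defaults.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

NUM_SHAPE = 10  # SMPL shape coefficients (betas), common default
NUM_POSE = 72   # 24 joints x 3 axis-angle values, common default


@dataclass(frozen=True)
class SmplParams:
    """Hypothetical container for one frame of SMPL parameters."""
    betas: Tuple[float, ...]  # body shape
    pose: Tuple[float, ...]   # joint rotations


def align_shape(reference: SmplParams,
                driving: Sequence[SmplParams]) -> List[SmplParams]:
    """Assumed alignment rule: reference shape, driving pose.

    Each output frame reuses the shape of the reference image and the
    pose of the corresponding source-video frame.
    """
    return [SmplParams(betas=reference.betas, pose=frame.pose)
            for frame in driving]
```

Under this assumption, the aligned sequence would then be rendered into depth, normal, and semantic maps, so that the guidance follows the motion of the source video while matching the body proportions of the reference image.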