MagicMan: Generative Novel View Synthesis of Humans with 3D-Aware Diffusion and Iterative Refinement

August 26, 2024
Authors: Xu He, Xiaoyu Li, Di Kang, Jiangnan Ye, Chaopeng Zhang, Liyang Chen, Xiangjun Gao, Han Zhang, Zhiyong Wu, Haolin Zhuang
cs.AI

Abstract

Existing works in single-image human reconstruction suffer from weak generalizability due to insufficient training data, or from 3D inconsistencies due to a lack of comprehensive multi-view knowledge. In this paper, we introduce MagicMan, a human-specific multi-view diffusion model designed to generate high-quality novel view images from a single reference image. At its core, we leverage a pre-trained 2D diffusion model as the generative prior for generalizability, with the parametric SMPL-X model as the 3D body prior to promote 3D awareness. To tackle the critical challenge of maintaining consistency while achieving dense multi-view generation for improved 3D human reconstruction, we first introduce hybrid multi-view attention to facilitate both efficient and thorough information interchange across different views. Additionally, we present a geometry-aware dual branch to perform concurrent generation in both the RGB and normal domains, further enhancing consistency via geometry cues. Finally, to address ill-shaped issues arising from inaccurate SMPL-X estimation that conflicts with the reference image, we propose a novel iterative refinement strategy, which progressively optimizes SMPL-X accuracy while enhancing the quality and consistency of the generated multi-views. Extensive experimental results demonstrate that our method significantly outperforms existing approaches in both novel view synthesis and subsequent 3D human reconstruction tasks.