MagicMan: Generative Novel View Synthesis of Humans with 3D-Aware Diffusion and Iterative Refinement

August 26, 2024
Authors: Xu He, Xiaoyu Li, Di Kang, Jiangnan Ye, Chaopeng Zhang, Liyang Chen, Xiangjun Gao, Han Zhang, Zhiyong Wu, Haolin Zhuang
cs.AI

Abstract

Existing single-image human reconstruction methods generalize poorly because of insufficient training data, or produce 3D inconsistencies because they lack comprehensive multi-view knowledge. In this paper, we introduce MagicMan, a human-specific multi-view diffusion model designed to generate high-quality novel view images from a single reference image. At its core, we leverage a pre-trained 2D diffusion model as the generative prior for generalizability, with the parametric SMPL-X model as the 3D body prior to promote 3D awareness. To tackle the critical challenge of maintaining consistency while achieving dense multi-view generation for improved 3D human reconstruction, we first introduce hybrid multi-view attention to facilitate both efficient and thorough information interchange across different views. Additionally, we present a geometry-aware dual branch that generates concurrently in the RGB and normal domains, further enhancing consistency via geometry cues. Finally, to address ill-shaped results arising from inaccurate SMPL-X estimates that conflict with the reference image, we propose a novel iterative refinement strategy, which progressively optimizes SMPL-X accuracy while enhancing the quality and consistency of the generated multi-view images. Extensive experimental results demonstrate that our method significantly outperforms existing approaches in both novel view synthesis and subsequent 3D human reconstruction tasks.
