LHM: Large Animatable Human Reconstruction Model from a Single Image in Seconds
March 13, 2025
Authors: Lingteng Qiu, Xiaodong Gu, Peihao Li, Qi Zuo, Weichao Shen, Junfei Zhang, Kejie Qiu, Weihao Yuan, Guanying Chen, Zilong Dong, Liefeng Bo
cs.AI
Abstract
Animatable 3D human reconstruction from a single image is a challenging
problem due to the ambiguity in decoupling geometry, appearance, and
deformation. Recent advances in 3D human reconstruction mainly focus on static
human modeling, and the reliance on synthetic 3D scans for training
limits their generalization ability. Conversely, optimization-based video
methods achieve higher fidelity but demand controlled capture conditions and
computationally intensive refinement processes. Motivated by the emergence of
large reconstruction models for efficient static reconstruction, we propose LHM
(Large Animatable Human Reconstruction Model) to infer high-fidelity avatars
represented as 3D Gaussian splatting in a feed-forward pass. Our model
leverages a multimodal transformer architecture to effectively encode the human
body positional features and image features with an attention mechanism, enabling
detailed preservation of clothing geometry and texture. To further boost the
face identity preservation and fine detail recovery, we propose a head feature
pyramid encoding scheme to aggregate multi-scale features of the head regions.
Extensive experiments demonstrate that our LHM generates plausible animatable
humans in seconds without post-processing for the face and hands, outperforming
existing methods in both reconstruction accuracy and generalization ability.
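
To make the idea of encoding body positional features and image features jointly through attention more concrete, here is a minimal sketch in plain Python. It is not the LHM architecture: the token counts, the feature size `DIM`, the function name `joint_self_attention`, and the use of identity query/key/value projections are all assumptions made for illustration. It only shows how two token sets can be concatenated into one sequence so that every token attends to both modalities.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def joint_self_attention(body_tokens, image_tokens):
    """Single-head self-attention over the concatenation of both token sets.

    Identity Q/K/V projections are used for brevity (an assumption; a real
    transformer would use learned projections, several heads, and MLP blocks).
    """
    tokens = body_tokens + image_tokens  # one shared sequence
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    out = []
    for q in tokens:
        # Scaled dot-product score of this query against every token.
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in tokens]
        weights = softmax(scores)
        # Weighted average of all tokens (values equal the tokens themselves).
        out.append([sum(w * v[d] for w, v in zip(weights, tokens))
                    for d in range(dim)])
    # Split back so each modality keeps its own updated tokens.
    n_body = len(body_tokens)
    return out[:n_body], out[n_body:]


random.seed(0)
DIM = 8  # hypothetical feature size
# Hypothetical body-point tokens and image-patch tokens.
body = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(4)]
image = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(6)]

new_body, new_image = joint_self_attention(body, image)
print(len(new_body), len(new_image))
```

Because the attention weights are computed over the shared sequence, each updated body token is a mixture of body and image tokens, which is the basic mechanism by which image evidence can inform tokens tied to body positions.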