TeCH: Text-guided Reconstruction of Lifelike Clothed Humans
August 16, 2023
Authors: Yangyi Huang, Hongwei Yi, Yuliang Xiu, Tingting Liao, Jiaxiang Tang, Deng Cai, Justus Thies
cs.AI
Abstract
Despite recent research advancements in reconstructing clothed humans from a
single image, accurately restoring the "unseen regions" with a high level of
detail remains an open challenge that has received little attention. Existing
methods often generate overly smooth back-side surfaces with blurry textures.
But how can all the visual attributes of an individual be captured effectively
from a single image, so that they suffice to reconstruct the unseen areas
(e.g., the back view)?
Motivated by the power of foundation models, TeCH reconstructs the 3D human by
leveraging 1) descriptive text prompts (e.g., garments, colors, hairstyles),
which are automatically generated via a garment parsing model and Visual
Question Answering (VQA), and 2) a personalized, fine-tuned Text-to-Image
diffusion model (T2I), which learns the "indescribable" appearance. To represent
high-resolution 3D clothed humans at an affordable cost, we propose a hybrid 3D
representation based on DMTet, which consists of an explicit body shape grid
and an implicit distance field. Guided by the descriptive prompts +
personalized T2I diffusion model, the geometry and texture of the 3D humans are
optimized through multi-view Score Distillation Sampling (SDS) and
reconstruction losses based on the original observation. TeCH produces
high-fidelity 3D clothed humans with consistent and delicate texture, and
detailed full-body geometry. Quantitative and qualitative experiments
demonstrate that TeCH outperforms state-of-the-art methods in terms of
reconstruction accuracy and rendering quality. The code will be publicly
available for research purposes at https://huangyangyi.github.io/tech.
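
Note on the optimization objective (editor's sketch): the abstract refers to multi-view Score Distillation Sampling (SDS) combined with reconstruction losses on the original observation. Below is a minimal sketch of the standard SDS gradient in the DreamFusion formulation, which this line of work builds on; the notation is ours and is not taken from the TeCH paper. Here theta denotes the parameters of the hybrid DMTet representation, g(theta, c) the image rendered from camera c, y the descriptive text prompt, and \hat{\epsilon}_\phi the noise predicted by the personalized T2I diffusion model.

    % Standard SDS gradient (DreamFusion-style); symbols are illustrative, not TeCH's.
    \nabla_\theta \mathcal{L}_{\mathrm{SDS}}
      = \mathbb{E}_{t,\epsilon,c}\!\left[
          w(t)\,\bigl(\hat{\epsilon}_\phi(x_t;\, y,\, t) - \epsilon\bigr)\,
          \frac{\partial x}{\partial \theta}
        \right],
    \qquad x = g(\theta, c), \quad x_t = \alpha_t\, x + \sigma_t\, \epsilon .

    % One plausible overall objective; the weight and the exact reconstruction
    % terms are assumptions, since the abstract does not specify them.
    \mathcal{L}_{\mathrm{total}}
      = \mathcal{L}_{\mathrm{SDS}}
      + \lambda_{\mathrm{recon}}\,
        \mathcal{L}_{\mathrm{recon}}\bigl(g(\theta, c_{\mathrm{input}}),\, I_{\mathrm{input}}\bigr).

In words: random camera views are rendered from the current 3D estimate, pushed toward the text-conditioned distribution of the personalized diffusion model via the SDS gradient, while a reconstruction term keeps the input-view rendering consistent with the observed image.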