TeCH: Text-guided Reconstruction of Lifelike Clothed Humans

August 16, 2023
Authors: Yangyi Huang, Hongwei Yi, Yuliang Xiu, Tingting Liao, Jiaxiang Tang, Deng Cai, Justus Thies
cs.AI

Abstract

Despite recent research advances in reconstructing clothed humans from a single image, accurately restoring the "unseen regions" with high-level detail remains an open challenge that has received little attention. Existing methods often generate overly smooth back-side surfaces with blurry texture. But how can we effectively capture, from a single image, all of an individual's visual attributes that are sufficient to reconstruct unseen areas (e.g., the back view)? Motivated by the power of foundation models, TeCH reconstructs the 3D human by leveraging 1) descriptive text prompts (e.g., garments, colors, hairstyles), which are automatically generated via a garment parsing model and Visual Question Answering (VQA), and 2) a personalized fine-tuned text-to-image (T2I) diffusion model, which learns the "indescribable" appearance. To represent high-resolution 3D clothed humans at an affordable cost, we propose a hybrid 3D representation based on DMTet, which consists of an explicit body shape grid and an implicit distance field. Guided by the descriptive prompts and the personalized T2I diffusion model, the geometry and texture of the 3D human are optimized through multi-view Score Distillation Sampling (SDS) and reconstruction losses based on the original observation. TeCH produces high-fidelity 3D clothed humans with consistent, delicate texture and detailed full-body geometry. Quantitative and qualitative experiments demonstrate that TeCH outperforms state-of-the-art methods in reconstruction accuracy and rendering quality. The code will be publicly available for research purposes at https://huangyangyi.github.io/tech
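For readers unfamiliar with SDS, the gradient used to optimize a differentiable 3D representation follows the standard DreamFusion formulation; the exact weighting TeCH uses is not specified in this abstract:

```latex
% SDS gradient w.r.t. 3D parameters \theta, for a rendering x = g(\theta),
% denoiser \hat{\epsilon}_\phi, text condition y, and timestep weighting w(t):
\nabla_\theta \mathcal{L}_{\mathrm{SDS}}\bigl(\phi,\, x = g(\theta)\bigr)
  = \mathbb{E}_{t,\epsilon}\!\left[ w(t)\,
    \bigl(\hat{\epsilon}_\phi(x_t;\, y,\, t) - \epsilon\bigr)\,
    \frac{\partial x}{\partial \theta} \right]
```

Below is a minimal PyTorch sketch of one SDS step as described by this formulation; `unet` and its call signature are hypothetical placeholders for a text-conditioned diffusion denoiser, not TeCH's actual code:

```python
import torch

def sds_grad(unet, rendered, text_emb, alphas_cumprod):
    """One DreamFusion-style SDS step: returns the gradient to inject into
    the rendered image, skipping backprop through the denoiser."""
    b = rendered.shape[0]
    # Sample a random diffusion timestep per image (this range is a common choice).
    t = torch.randint(20, 980, (b,), device=rendered.device)
    noise = torch.randn_like(rendered)
    a_t = alphas_cumprod[t].view(b, 1, 1, 1)
    # Forward-diffuse the rendering to noise level t.
    noisy = a_t.sqrt() * rendered + (1.0 - a_t).sqrt() * noise
    with torch.no_grad():
        eps_pred = unet(noisy, t, text_emb)   # hypothetical denoiser call
    w = 1.0 - a_t                             # common weighting choice, w(t) = 1 - alpha_bar_t
    return w * (eps_pred - noise)

# Usage: inject the gradient directly, so autograd flows only through the
# differentiable renderer of the hybrid DMTet representation.
# rendered = renderer(theta)
# rendered.backward(gradient=sds_grad(unet, rendered, text_emb, alphas_cumprod))
```

Injecting the precomputed gradient via `backward(gradient=...)` reflects the key property of SDS: the diffusion model acts as a frozen critic, and no gradients flow through the denoiser itself.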