TeCH: Text-guided Reconstruction of Lifelike Clothed Humans
August 16, 2023
Authors: Yangyi Huang, Hongwei Yi, Yuliang Xiu, Tingting Liao, Jiaxiang Tang, Deng Cai, Justus Thies
cs.AI
Abstract
Despite recent research advancements in reconstructing clothed humans from a
single image, accurately restoring the "unseen regions" with high-level detail
remains an unsolved and largely overlooked challenge. Existing methods often
generate overly smooth back-side surfaces with blurry textures. But how can all
the visual attributes of an individual that are needed to reconstruct unseen
areas (e.g., the back view) be captured effectively from a single image?
Motivated by the power of foundation models, TeCH reconstructs the 3D human by
leveraging 1) descriptive text prompts (e.g., garments, colors, hairstyles)
which are automatically generated via a garment parsing model and Visual
Question Answering (VQA), and 2) a personalized, fine-tuned text-to-image (T2I)
diffusion model that learns the "indescribable" appearance. To represent
high-resolution 3D clothed humans at an affordable cost, we propose a hybrid 3D
representation based on DMTet, which consists of an explicit body shape grid
and an implicit distance field. Guided by the descriptive prompts and the
personalized T2I diffusion model, the geometry and texture of the 3D humans are
optimized through multi-view Score Distillation Sampling (SDS) and
reconstruction losses based on the original observation. TeCH produces
high-fidelity 3D clothed humans with consistent and delicate textures and
detailed full-body geometry. Quantitative and qualitative experiments
demonstrate that TeCH outperforms the state-of-the-art methods in terms of
reconstruction accuracy and rendering quality. The code will be publicly
available for research purposes at https://huangyangyi.github.io/tech
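
To make the prompt-generation step concrete, the sketch below shows how descriptive attribute prompts could be gathered from the input photo with an off-the-shelf VQA model. It is a minimal illustration that assumes the Hugging Face transformers BLIP VQA checkpoint (Salesforce/blip-vqa-base), a hand-written question list, and a DreamBooth-style "sks" identifier token; the actual garment-parsing model, question set, and prompt template used by TeCH may differ.

```python
# Hypothetical sketch: build a descriptive text prompt from a single photo via VQA.
# Assumes the BLIP VQA model from Hugging Face; TeCH's actual pipeline may differ.
from PIL import Image
from transformers import BlipProcessor, BlipForQuestionAnswering

processor = BlipProcessor.from_pretrained("Salesforce/blip-vqa-base")
model = BlipForQuestionAnswering.from_pretrained("Salesforce/blip-vqa-base")

def describe_subject(image_path: str) -> str:
    """Ask attribute questions about the pictured person and assemble a text prompt."""
    image = Image.open(image_path).convert("RGB")
    questions = {  # hand-written attribute questions (illustrative only)
        "top": "What kind of top is the person wearing?",
        "bottom": "What kind of pants or skirt is the person wearing?",
        "colors": "What are the main colors of the clothing?",
        "hair": "What hairstyle does the person have?",
    }
    answers = {}
    for key, question in questions.items():
        inputs = processor(image, question, return_tensors="pt")
        output_ids = model.generate(**inputs, max_new_tokens=20)
        answers[key] = processor.decode(output_ids[0], skip_special_tokens=True)
    # Assemble a DreamBooth-style prompt around a rare identifier token ("sks").
    return (f"a photo of a sks person wearing a {answers['colors']} {answers['top']} "
            f"and {answers['bottom']}, with {answers['hair']} hair")

print(describe_subject("front_view.png"))  # e.g. "a photo of a sks person wearing a blue t-shirt ..."
```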
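
The geometry and texture optimization leans on Score Distillation Sampling. For orientation, the standard SDS gradient from DreamFusion, which multi-view SDS applies across several camera views, can be written as below, where $\theta$ parameterizes the hybrid 3D representation, $x = g(\theta)$ is a rendered view, $\epsilon_\phi$ is the frozen (here personalized) T2I model's noise prediction conditioned on the prompt $y$, and $w(t)$ is a timestep weighting; the exact weighting and its combination with the image-space reconstruction losses in TeCH are not detailed in the abstract:

$$
\nabla_\theta \mathcal{L}_{\text{SDS}} \;=\; \mathbb{E}_{t,\epsilon}\Big[\, w(t)\,\big(\epsilon_\phi(x_t;\, y,\, t) - \epsilon\big)\,\frac{\partial x}{\partial \theta} \,\Big],
\qquad x_t = \alpha_t\, x + \sigma_t\, \epsilon .
$$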