
OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on

March 4, 2024
Authors: Yuhao Xu, Tao Gu, Weifeng Chen, Chengcai Chen
cs.AI

Abstract

Image-based virtual try-on (VTON), which aims to generate an outfitted image of a target human wearing an in-shop garment, is a challenging image-synthesis task calling for not only high fidelity of the outfitted human but also full preservation of garment details. To tackle this issue, we propose Outfitting over Try-on Diffusion (OOTDiffusion), leveraging the power of pretrained latent diffusion models and designing a novel network architecture for realistic and controllable virtual try-on. Without an explicit warping process, we propose an outfitting UNet to learn the garment detail features, and merge them with the target human body via our proposed outfitting fusion in the denoising process of diffusion models. In order to further enhance the controllability of our outfitting UNet, we introduce outfitting dropout to the training process, which enables us to adjust the strength of garment features through classifier-free guidance. Our comprehensive experiments on the VITON-HD and Dress Code datasets demonstrate that OOTDiffusion efficiently generates high-quality outfitted images for arbitrary human and garment images, which outperforms other VTON methods in both fidelity and controllability, indicating an impressive breakthrough in virtual try-on. Our source code is available at https://github.com/levihsu/OOTDiffusion.
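The abstract notes that outfitting dropout lets the model adjust garment-feature strength through classifier-free guidance. As a minimal sketch of that blending step (not the paper's implementation; the function name and flat-list tensor representation are illustrative), the guided noise prediction interpolates between the unconditional and garment-conditioned predictions:

```python
def cfg_blend(eps_uncond, eps_cond, guidance_scale):
    """Classifier-free guidance blend of two noise predictions.

    eps = eps_uncond + s * (eps_cond - eps_uncond)

    guidance_scale = 1.0 recovers the garment-conditioned prediction;
    larger values push the output further toward the garment features,
    guidance_scale = 0.0 ignores the garment entirely.
    """
    return [u + guidance_scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]
```

In practice the unconditional branch is obtained by dropping the garment conditioning (the role of outfitting dropout during training), so a single scalar at inference time trades off fidelity to the garment against freedom of the generation.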