

IMAGDressing-v1: Customizable Virtual Dressing

July 17, 2024
Authors: Fei Shen, Xin Jiang, Xin He, Hu Ye, Cong Wang, Xiaoyu Du, Zechao Li, Jinghui Tang
cs.AI

Abstract

Latest advances have achieved realistic virtual try-on (VTON) through localized garment inpainting using latent diffusion models, significantly enhancing consumers' online shopping experience. However, existing VTON technologies neglect the need for merchants to showcase garments comprehensively, including flexible control over garments, optional faces, poses, and scenes. To address this issue, we define a virtual dressing (VD) task focused on generating freely editable human images with fixed garments and optional conditions. Meanwhile, we design a comprehensive affinity metric index (CAMI) to evaluate the consistency between generated images and reference garments. Then, we propose IMAGDressing-v1, which incorporates a garment UNet that captures semantic features from CLIP and texture features from VAE. We present a hybrid attention module, including a frozen self-attention and a trainable cross-attention, to integrate garment features from the garment UNet into a frozen denoising UNet, ensuring users can control different scenes through text. IMAGDressing-v1 can be combined with other extension plugins, such as ControlNet and IP-Adapter, to enhance the diversity and controllability of generated images. Furthermore, to address the lack of data, we release the interactive garment pairing (IGPair) dataset, containing over 300,000 pairs of clothing and dressed images, and establish a standard pipeline for data assembly. Extensive experiments demonstrate that our IMAGDressing-v1 achieves state-of-the-art human image synthesis performance under various controlled conditions. The code and model will be available at https://github.com/muzishen/IMAGDressing.
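To make the hybrid attention idea concrete, below is a minimal PyTorch sketch, not the authors' released implementation: the class name `HybridAttention`, the tensor shapes, and the additive fusion of the two branches are illustrative assumptions. The abstract only specifies that a frozen self-attention and a trainable cross-attention integrate garment-UNet features into the frozen denoising UNet.

```python
# Hypothetical sketch of a hybrid attention block: a frozen self-attention
# branch (weights conceptually taken from the pretrained denoising UNet) plus
# a trainable cross-attention branch that attends to garment features.
import torch
import torch.nn as nn


class HybridAttention(nn.Module):
    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        # Self-attention over the denoising UNet's hidden states; kept frozen.
        self.self_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        for p in self.self_attn.parameters():
            p.requires_grad = False
        # Cross-attention to garment features; this branch is trainable.
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, hidden_states: torch.Tensor, garment_features: torch.Tensor) -> torch.Tensor:
        # hidden_states:    (B, N, dim) tokens from the frozen denoising UNet
        # garment_features: (B, M, dim) tokens produced by the garment UNet
        self_out, _ = self.self_attn(hidden_states, hidden_states, hidden_states)
        cross_out, _ = self.cross_attn(hidden_states, garment_features, garment_features)
        # Sum the frozen and trainable branches so garment information is
        # injected without disturbing the pretrained self-attention pathway.
        return self_out + cross_out


# Toy usage: 4096 image tokens attending to 77 garment tokens of width 320.
block = HybridAttention(dim=320)
x = torch.randn(1, 4096, 320)
g = torch.randn(1, 77, 320)
out = block(x, g)  # (1, 4096, 320)
```

In this sketch the garment cue is added on top of the unchanged self-attention output, which mirrors the abstract's claim that scene control via text is preserved because the original denoising UNet pathway stays frozen.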
