Any2AnyTryon: Leveraging Adaptive Position Embeddings for Versatile Virtual Clothing Tasks
January 27, 2025
Authors: Hailong Guo, Bohan Zeng, Yiren Song, Wentao Zhang, Chuang Zhang, Jiaming Liu
cs.AI
Abstract
Image-based virtual try-on (VTON) aims to generate a virtual try-on result by
transferring an input garment onto a target person's image. However, the
scarcity of paired garment-model data makes it challenging for existing methods
to achieve high generalization and quality in VTON. It also limits the ability
to generate mask-free try-ons. To tackle the data scarcity problem, approaches
such as Stable Garment and MMTryon use a synthetic data strategy, effectively
increasing the amount of paired data on the model side. However, existing
methods are typically limited to performing specific try-on tasks and lack
user-friendliness. To enhance the generalization and controllability of VTON
generation, we propose Any2AnyTryon, which can generate try-on results based on
different textual instructions and model garment images to meet various needs,
eliminating the reliance on masks, poses, or other conditions. Specifically, we
first construct the virtual try-on dataset LAION-Garment, the largest known
open-source garment try-on dataset. Then, we introduce adaptive position
embedding, which enables the model to generate satisfactory outfitted model
images or garment images based on input images of different sizes and
categories, significantly enhancing the generalization and controllability of
VTON generation. In our experiments, we demonstrate the effectiveness of
Any2AnyTryon and compare it with existing methods. The results show that
Any2AnyTryon enables flexible, controllable, and high-quality image-based
virtual try-on generation.
Project page: https://logn-2024.github.io/Any2anyTryonProjectPage/