Infinite-ID: Identity-preserved Personalization via ID-semantics Decoupling Paradigm

March 18, 2024
Authors: Yi Wu, Ziqiang Li, Heliang Zheng, Chaoyue Wang, Bin Li
cs.AI

Abstract

Drawing on recent advancements in diffusion models for text-to-image generation, identity-preserved personalization has made significant progress in accurately capturing specific identities with just a single reference image. However, existing methods primarily integrate reference images within the text embedding space, leading to a complex entanglement of image and text information, which poses challenges for preserving both identity fidelity and semantic consistency. To tackle this challenge, we propose Infinite-ID, an ID-semantics decoupling paradigm for identity-preserved personalization. Specifically, we introduce identity-enhanced training, incorporating an additional image cross-attention module to capture sufficient ID information while deactivating the original text cross-attention module of the diffusion model. This ensures that the image stream faithfully represents the identity provided by the reference image while mitigating interference from textual input. Additionally, we introduce a feature interaction mechanism that combines a mixed attention module with an AdaIN-mean operation to seamlessly merge the two streams. This mechanism not only enhances the fidelity of identity and semantic consistency but also enables convenient control over the styles of the generated images. Extensive experimental results on both raw photo generation and style image generation demonstrate the superior performance of our proposed method.
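
The abstract names an AdaIN-mean operation but does not define it. As a rough illustration only, the sketch below assumes it means a mean-only variant of adaptive instance normalization: each channel of one stream is shifted so that its mean matches the other stream's, and the variance is left untouched. The function names, the direction of the alignment (image stream shifted toward the text stream), and the toy values are all assumptions made for this example and are not taken from the paper.

```python
from typing import List

Features = List[List[float]]  # layout: [tokens][channels]


def channel_means(feats: Features) -> List[float]:
    """Mean of each channel, taken over all tokens."""
    n_tokens = len(feats)
    n_channels = len(feats[0])
    return [sum(tok[c] for tok in feats) / n_tokens for c in range(n_channels)]


def adain_mean(content: Features, style: Features) -> Features:
    """Shift each channel of `content` so its mean matches that of `style`.

    Assumed form: content - mean(content) + mean(style).
    Unlike full AdaIN, the variance of `content` is not rescaled.
    """
    mu_c = channel_means(content)
    mu_s = channel_means(style)
    return [
        [v - mc + ms for v, mc, ms in zip(tok, mu_c, mu_s)]
        for tok in content
    ]


# Toy features (hypothetical values): 2 tokens, 2 channels each.
image_stream = [[1.0, 10.0], [3.0, 30.0]]  # identity-bearing features
text_stream = [[0.0, 0.0], [2.0, 4.0]]     # text-conditioned features

merged = adain_mean(image_stream, text_stream)
print(merged)                 # per-token offsets of image_stream are preserved
print(channel_means(merged))  # equals channel_means(text_stream)
```

Under this reading, the merged features keep the token-to-token variation of the image stream while taking on the channel-wise mean of the text stream, which is one way a mean shift could carry style information without overwriting identity detail.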
