Infinite-ID: Identity-preserved Personalization via ID-semantics Decoupling Paradigm

March 18, 2024
Authors: Yi Wu, Ziqiang Li, Heliang Zheng, Chaoyue Wang, Bin Li
cs.AI

Abstract

Drawing on recent advancements in diffusion models for text-to-image generation, identity-preserved personalization has made significant progress in accurately capturing specific identities with just a single reference image. However, existing methods primarily integrate reference images within the text embedding space, leading to a complex entanglement of image and text information, which poses challenges for preserving both identity fidelity and semantic consistency. To tackle this challenge, we propose Infinite-ID, an ID-semantics decoupling paradigm for identity-preserved personalization. Specifically, we introduce identity-enhanced training, incorporating an additional image cross-attention module to capture sufficient ID information while deactivating the original text cross-attention module of the diffusion model. This ensures that the image stream faithfully represents the identity provided by the reference image while mitigating interference from textual input. Additionally, we introduce a feature interaction mechanism that combines a mixed attention module with an AdaIN-mean operation to seamlessly merge the two streams. This mechanism not only enhances the fidelity of identity and semantic consistency but also enables convenient control over the styles of the generated images. Extensive experimental results on both raw photo generation and style image generation demonstrate the superior performance of our proposed method.
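
The abstract names an AdaIN-mean operation but does not define it. One plausible reading, sketched below, is a variant of adaptive instance normalization that aligns only the per-channel mean of one feature map to another and leaves the variance untouched. The function name `adain_mean`, the list-of-channels layout, and the choice of which stream supplies the target mean are all assumptions made for illustration, not details taken from the paper.

```python
from statistics import fmean


def adain_mean(content, style):
    """Shift each channel of `content` so its mean matches that channel of `style`.

    Assumed layout (hypothetical): a list of channels, each a flat list of floats.
    Only the mean is aligned; the spread of values within a channel is unchanged.
    """
    if len(content) != len(style):
        raise ValueError("content and style must have the same number of channels")
    aligned = []
    for content_channel, style_channel in zip(content, style):
        # Per-channel offset that moves the content mean onto the style mean.
        shift = fmean(style_channel) - fmean(content_channel)
        aligned.append([value + shift for value in content_channel])
    return aligned


# Toy features: two channels, three spatial positions each.
image_feats = [[1.0, 2.0, 3.0], [10.0, 10.0, 13.0]]
text_feats = [[0.0, 0.0, 0.0], [4.0, 5.0, 6.0]]
aligned = adain_mean(image_feats, text_feats)
```

After the call, each channel of `aligned` has the same mean as the matching channel of `text_feats`, while the differences between values within a channel are the same as in `image_feats`.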
