InstantID: Zero-shot Identity-Preserving Generation in Seconds

January 15, 2024
Authors: Qixun Wang, Xu Bai, Haofan Wang, Zekui Qin, Anthony Chen
cs.AI

Abstract

Personalized image synthesis has advanced significantly with methods such as Textual Inversion, DreamBooth, and LoRA. Yet their real-world applicability is hindered by high storage demands, lengthy fine-tuning, and the need for multiple reference images. Conversely, existing ID embedding-based methods require only a single forward inference but face their own challenges: they either need extensive fine-tuning across numerous model parameters, lack compatibility with community pre-trained models, or fail to maintain high face fidelity. To address these limitations, we introduce InstantID, a powerful diffusion model-based solution. Our plug-and-play module handles image personalization in various styles using just a single facial image while ensuring high fidelity. To achieve this, we design a novel IdentityNet that imposes strong semantic and weak spatial conditions, integrating facial and landmark images with textual prompts to steer image generation. InstantID demonstrates exceptional performance and efficiency, making it highly beneficial in real-world applications where identity preservation is paramount. Moreover, our work integrates seamlessly with popular pre-trained text-to-image diffusion models such as SD1.5 and SDXL, serving as an adaptable plugin. Our code and pre-trained checkpoints will be available at https://github.com/InstantID/InstantID.