Mix-of-Show: Decentralized Low-Rank Adaptation for Multi-Concept Customization of Diffusion Models
May 29, 2023
Authors: Yuchao Gu, Xintao Wang, Jay Zhangjie Wu, Yujun Shi, Yunpeng Chen, Zihan Fan, Wuyou Xiao, Rui Zhao, Shuning Chang, Weijia Wu, Yixiao Ge, Ying Shan, Mike Zheng Shou
cs.AI
Abstract
Public large-scale text-to-image diffusion models, such as Stable Diffusion,
have gained significant attention from the community. These models can be
easily customized for new concepts using low-rank adaptations (LoRAs). However,
jointly supporting multiple customized concepts with multiple concept LoRAs
remains a challenge. We refer to this scenario as
decentralized multi-concept customization, which involves single-client concept
tuning and center-node concept fusion. In this paper, we propose a new
framework called Mix-of-Show that addresses the challenges of decentralized
multi-concept customization, including concept conflicts resulting from
existing single-client LoRA tuning and identity loss during model fusion.
Mix-of-Show adopts an embedding-decomposed LoRA (ED-LoRA) for single-client
tuning and gradient fusion for the center node to preserve the in-domain
essence of single concepts and support theoretically limitless concept fusion.
Additionally, we introduce regionally controllable sampling, which extends
spatially controllable sampling (e.g., ControlNet and T2I-Adapter) to address
attribute binding and missing object problems in multi-concept sampling.
Extensive experiments demonstrate that Mix-of-Show is capable of composing
multiple customized concepts with high fidelity, including characters, objects,
and scenes.
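
To make the setting concrete, the following is a minimal sketch of what a low-rank adaptation is and why fusing several of them is not trivial. It is not the paper's ED-LoRA or gradient fusion. The weighted-sum fusion shown here is only an assumed naive baseline, and every name in it (`lora_delta`, `fuse_weighted`, the matrix sizes, the 0.5 weights) is hypothetical and chosen for illustration.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b, scale=1.0):
    """Return a + scale * b, elementwise."""
    return [[x + scale * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def random_matrix(rows, cols, rng):
    """Matrix with entries drawn uniformly from [-1, 1]."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def lora_delta(b, a):
    """Low-rank weight update: B (d_out x r) times A (r x d_in)."""
    return matmul(b, a)

def fuse_weighted(w0, deltas, weights):
    """Assumed naive baseline: W0 + sum_i weight_i * delta_i."""
    fused = [row[:] for row in w0]
    for delta, weight in zip(deltas, weights):
        fused = add(fused, delta, weight)
    return fused

def max_abs_diff(a, b):
    """Largest elementwise absolute difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

rng = random.Random(0)
d_out, d_in, rank = 4, 6, 2  # illustrative sizes only

# Shared pretrained weight (stands in for one layer of the base model).
w0 = random_matrix(d_out, d_in, rng)

# Two independently tuned concept LoRAs, one per "client".
delta_1 = lora_delta(random_matrix(d_out, rank, rng), random_matrix(rank, d_in, rng))
delta_2 = lora_delta(random_matrix(d_out, rank, rng), random_matrix(rank, d_in, rng))

# Single-concept model: base weight plus its own low-rank update.
w_concept_1 = add(w0, delta_1)

# Naive fusion at the "center node": average the two updates.
w_fused = fuse_weighted(w0, [delta_1, delta_2], [0.5, 0.5])

# The fused weight drifts away from the single-concept weight.
drift = max_abs_diff(w_fused, w_concept_1)
print(f"max drift from concept 1 after naive fusion: {drift:.4f}")
```

In this toy setup the fused layer matches neither single-concept layer, which is one plausible way to read the identity loss during model fusion that the abstract describes.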