SCEdit: Efficient and Controllable Image Diffusion Generation via Skip Connection Editing
December 18, 2023
Authors: Zeyinzi Jiang, Chaojie Mao, Yulin Pan, Zhen Han, Jingfeng Zhang
cs.AI
Abstract
Image diffusion models have been utilized in various tasks, such as
text-to-image generation and controllable image synthesis. Recent research has
introduced tuning methods that make subtle adjustments to the original models,
yielding promising results in specific adaptations of foundational generative
diffusion models. Rather than modifying the main backbone of the diffusion
model, we delve into the role of skip connections in U-Net and reveal that
hierarchical features aggregating long-distance information across encoder and
decoder make a significant impact on the content and quality of image
generation. Based on this observation, we propose an efficient generative tuning
framework, dubbed SCEdit, which integrates and edits Skip Connection using a
lightweight tuning module named SC-Tuner. Furthermore, the proposed framework
allows for straightforward extension to controllable image synthesis by
injecting different conditions with Controllable SC-Tuner, simplifying and
unifying the network design for multi-condition inputs. Our SCEdit
substantially reduces training parameters, memory usage, and computational
expense due to its lightweight tuners, with backward propagation only passing
to the decoder blocks. Extensive experiments conducted on text-to-image
generation and controllable image synthesis tasks demonstrate the superiority
of our method in terms of efficiency and performance. Project page:
https://scedit.github.io/
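
The idea of editing skip connections with a lightweight tuner can be illustrated with a toy sketch. This is not the authors' implementation: the names `SkipTuner` and `ToyUNet`, the residual down/up-projection form of the tuner, the zero-initialised up-projection, and the additive merge of skip features are all assumptions made for illustration. The sketch only runs a forward pass and counts parameters. It does not train anything, but because the tuners act on features that feed the decoder, a training loop built on this layout would not need gradients inside the frozen encoder.

```python
import random

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def relu(x):
    return [max(0.0, v) for v in x]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

class SkipTuner:
    """Hypothetical residual adapter applied to one skip-connection feature."""
    def __init__(self, dim, hidden, rng):
        self.down = rand_matrix(hidden, dim, rng)
        # Assumption: a zero-initialised up-projection makes the tuner an
        # identity map before any training.
        self.up = [[0.0] * hidden for _ in range(dim)]

    def __call__(self, skip):
        delta = matvec(self.up, relu(matvec(self.down, skip)))
        return [s + d for s, d in zip(skip, delta)]  # residual connection

    def num_params(self):
        return len(self.down) * len(self.down[0]) + len(self.up) * len(self.up[0])

class ToyUNet:
    """Frozen toy backbone: encoder blocks, a middle block, decoder blocks."""
    def __init__(self, dim, depth, rng):
        self.enc = [rand_matrix(dim, dim, rng) for _ in range(depth)]
        self.mid = rand_matrix(dim, dim, rng)
        self.dec = [rand_matrix(dim, dim, rng) for _ in range(depth)]

    def forward(self, x, tuners=None):
        skips, h = [], x
        for W in self.enc:
            h = relu(matvec(W, h))
            skips.append(h)           # feature handed to the decoder
        h = relu(matvec(self.mid, h))
        for i, W in enumerate(self.dec):
            skip = skips.pop()        # deepest encoder feature first
            if tuners is not None:
                skip = tuners[i](skip)  # edit the skip connection only
            h = relu(matvec(W, [a + b for a, b in zip(h, skip)]))
        return h

    def num_params(self):
        return sum(len(W) * len(W[0]) for W in self.enc + [self.mid] + self.dec)

rng = random.Random(0)
dim, depth, hidden = 16, 3, 2
net = ToyUNet(dim, depth, rng)
tuners = [SkipTuner(dim, hidden, rng) for _ in range(depth)]
x = [rng.uniform(-1.0, 1.0) for _ in range(dim)]

baseline = net.forward(x)
tuned_at_init = net.forward(x, tuners)
tuner_params = sum(t.num_params() for t in tuners)
backbone_params = net.num_params()
print(tuner_params, backbone_params)
```

In this toy setting the tuners hold 192 parameters against 1792 in the backbone, and the tuned network reproduces the baseline output until the up-projections are changed. The actual ratios and tuner design in SCEdit are described in the paper and on the project page.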