DiffEditor: Boosting Accuracy and Flexibility on Diffusion-based Image Editing
February 4, 2024
Authors: Chong Mou, Xintao Wang, Jiechong Song, Ying Shan, Jian Zhang
cs.AI
Abstract
Large-scale Text-to-Image (T2I) diffusion models have revolutionized image
generation over the last few years. Although these models offer diverse, high-quality
generation capabilities, translating those abilities to fine-grained image
editing remains challenging. In this paper, we propose DiffEditor to rectify
two weaknesses in existing diffusion-based image editing: (1) in complex
scenarios, editing results are often inaccurate and exhibit unexpected
artifacts; (2) existing methods lack the flexibility to harmonize editing operations, e.g.,
to imagine new content. In our solution, we introduce image prompts into
fine-grained image editing, which work with the text prompt to better describe
the editing content. To increase flexibility while maintaining content
consistency, we locally combine stochastic differential equation (SDE) sampling with
ordinary differential equation (ODE) sampling. In addition, we incorporate
regional score-based gradient guidance and a time travel strategy into the
diffusion sampling, further improving the editing quality. Extensive
experiments demonstrate that our method can efficiently achieve
state-of-the-art performance on various fine-grained image editing tasks,
including editing within a single image (e.g., object moving, resizing, and
content dragging) and across images (e.g., appearance replacing and object
pasting). Our source code is released at
https://github.com/MC-E/DragonDiffusion.
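
The idea of locally combining SDE and ODE sampling can be illustrated with a standard DDIM-style update in which the noise scale is nonzero only inside an editing region. The sketch below is an illustration under stated assumptions, not the authors' implementation: the function name `ddim_step_regional`, the flat-list data layout, the binary `mask`, and the `eta` parameter are all hypothetical choices made for this example, and the noise prediction `eps` is passed in rather than produced by a real model.

```python
import math
import random


def ddim_step_regional(x_t, eps, mask, alpha_t, alpha_prev, eta, rng):
    """One DDIM-style update with noise injected only where mask == 1.

    x_t, eps and mask are flat lists of equal length (hypothetical layout).
    alpha_t and alpha_prev are cumulative alpha products at the current and
    the next (less noisy) step, with 0 < alpha_t < alpha_prev <= 1.
    eta = 0 gives the deterministic (ODE) update everywhere.
    """
    # Standard DDIM noise scale for this step.
    sigma = (
        eta
        * math.sqrt((1 - alpha_prev) / (1 - alpha_t))
        * math.sqrt(1 - alpha_t / alpha_prev)
    )
    out = []
    for x, e, m in zip(x_t, eps, mask):
        # Predicted clean sample from the current noisy sample.
        x0 = (x - math.sqrt(1 - alpha_t) * e) / math.sqrt(alpha_t)
        # Per-element noise scale: stochastic inside the region, zero outside.
        s = sigma * m
        direction = math.sqrt(max(1 - alpha_prev - s * s, 0.0)) * e
        noise = s * rng.gauss(0.0, 1.0) if m else 0.0
        out.append(math.sqrt(alpha_prev) * x0 + direction + noise)
    return out


# Toy usage: elements 1 and 2 form the "editing region".
x_t = [0.5, -1.0, 0.2, 0.8]
eps = [0.1, 0.3, -0.2, 0.05]
mask = [0, 1, 1, 0]
ode = ddim_step_regional(x_t, eps, mask, 0.5, 0.7, 0.0, random.Random(0))
mixed = ddim_step_regional(x_t, eps, mask, 0.5, 0.7, 1.0, random.Random(0))
```

In this toy setup, elements outside the mask follow the deterministic path, so unedited content is updated exactly as in the ODE case, while elements inside the mask receive fresh noise that gives the sampler room to generate new content.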