DiffEditor: Boosting Accuracy and Flexibility on Diffusion-based Image Editing
February 4, 2024
Authors: Chong Mou, Xintao Wang, Jiechong Song, Ying Shan, Jian Zhang
cs.AI
Abstract
Large-scale Text-to-Image (T2I) diffusion models have revolutionized image
generation over the last few years. Although these models offer diverse,
high-quality generation capabilities, translating those capabilities to
fine-grained image editing remains challenging. In this paper, we propose
DiffEditor to rectify
two weaknesses in existing diffusion-based image editing: (1) in complex
scenarios, editing results are often inaccurate and exhibit unexpected
artifacts; (2) existing methods lack the flexibility to harmonize editing
operations, e.g., by imagining new content. In our solution, we introduce
image prompts in fine-grained image editing, which work together with the
text prompt to better describe
the editing content. To increase flexibility while maintaining content
consistency, we locally combine stochastic differential equation (SDE) sampling
with ordinary differential equation (ODE) sampling. In addition, we incorporate
regional score-based gradient guidance and a time travel strategy into the
diffusion sampling, further improving the editing quality. Extensive
experiments demonstrate that our method can efficiently achieve
state-of-the-art performance on various fine-grained image editing tasks,
including editing within a single image (e.g., object moving, resizing, and
content dragging) and across images (e.g., appearance replacing and object
pasting). Our source code is released at
https://github.com/MC-E/DragonDiffusion.
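
The idea of locally combining SDE and ODE sampling can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes that "locally" means a spatial mask over the edited region, and it uses the standard DDIM update, where a noise scale of zero gives deterministic (ODE-like) sampling and a positive scale gives stochastic (SDE-like) sampling. The function name `ddim_step_regional` and the parameters `mask` and `eta` are hypothetical names introduced for this example.

```python
import numpy as np

def ddim_step_regional(x_t, eps, alpha_t, alpha_prev, mask, eta=1.0, rng=None):
    """One DDIM-style update with noise injected only where mask == 1.

    x_t        : current noisy sample
    eps        : noise predicted by the model at this step (assumed given)
    alpha_t    : cumulative alpha at the current step
    alpha_prev : cumulative alpha at the next, less noisy step (alpha_prev > alpha_t)
    mask       : array of 0/1 values; 1 marks the region sampled stochastically
    eta        : noise scale inside the mask (0 = fully deterministic)
    """
    rng = np.random.default_rng() if rng is None else rng
    # Estimate the clean sample from the current sample and predicted noise.
    x0_pred = (x_t - np.sqrt(1.0 - alpha_t) * eps) / np.sqrt(alpha_t)
    # Standard DDIM noise scale, zeroed outside the mask.
    sigma_full = eta * np.sqrt((1.0 - alpha_prev) / (1.0 - alpha_t)) \
                     * np.sqrt(1.0 - alpha_t / alpha_prev)
    sigma = sigma_full * mask
    # Coefficient of the predicted noise; clipped to guard against round-off.
    dir_coef = np.sqrt(np.clip(1.0 - alpha_prev - sigma ** 2, 0.0, None))
    noise = rng.standard_normal(x_t.shape)
    return np.sqrt(alpha_prev) * x0_pred + dir_coef * eps + sigma * noise

# Usage: only the central 2x2 patch receives fresh noise.
x_t = np.ones((4, 4))
eps = np.full((4, 4), 0.5)
mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1.0
x_prev = ddim_step_regional(x_t, eps, 0.5, 0.8, mask, rng=np.random.default_rng(0))
```

Outside the mask the update reduces to the deterministic DDIM step, so unedited content stays reproducible, while the masked region can vary from run to run. Whether the method also restricts the stochastic steps to a particular range of timesteps is not stated in the abstract.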