LEDITS: Real Image Editing with DDPM Inversion and Semantic Guidance

July 2, 2023
Authors: Linoy Tsaban, Apolinário Passos
cs.AI

Abstract

Recent large-scale text-guided diffusion models provide powerful image-generation capabilities. Currently, significant effort is being devoted to enabling the modification of these images using text only, as a means of offering intuitive and versatile editing. However, editing proves difficult for these generative models because editing inherently requires preserving certain content from the original image. Conversely, in text-based models, even minor modifications to the text prompt frequently produce an entirely different result, which makes one-shot generation that accurately matches the user's intent exceedingly challenging. In addition, to edit a real image with these state-of-the-art tools, one must first invert the image into the pre-trained model's domain, which adds another factor affecting edit quality as well as latency. In this exploratory report, we propose LEDITS, a combined lightweight approach for real-image editing that incorporates the Edit Friendly DDPM inversion technique with Semantic Guidance, thus extending Semantic Guidance to real-image editing while also harnessing the editing capabilities of DDPM inversion. This approach achieves versatile edits, both subtle and extensive, as well as alterations in composition and style, while requiring neither optimization nor extensions to the architecture.
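
The combination described in the abstract can be illustrated with a toy numerical sketch. This is not the authors' implementation. It assumes a hypothetical stand-in noise predictor (`toy_eps`), a plain DDPM update with step standard deviation sqrt(beta_t), and a heavily simplified guidance term (a scaled difference of noise estimates). It shows only the core idea: noise maps extracted from a real input reproduce that input exactly, and reusing them with a guided noise estimate yields an edited output.

```python
import numpy as np

def make_schedule(T=50, beta_start=1e-4, beta_end=0.02):
    # Linear beta schedule; the values are illustrative assumptions.
    betas = np.linspace(beta_start, beta_end, T)
    alphas = 1.0 - betas
    return betas, alphas, np.cumprod(alphas)

def toy_eps(x, t, shift=0.0):
    # Hypothetical stand-in for a pretrained noise predictor.
    return np.tanh(x) + shift

def posterior_mean(x_t, t, eps, betas, alphas, alpha_bars):
    # Mean of the DDPM reverse step given a noise estimate.
    return (x_t - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])

def invert(x0, eps_fn, betas, alphas, alpha_bars, rng):
    # Noise x0 independently at every level, then solve each reverse
    # step for the noise map z_t that links consecutive samples.
    T = len(betas)
    xs = [x0]
    for t in range(T):
        noise = rng.standard_normal(x0.shape)
        xs.append(np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * noise)
    zs = []
    for t in range(T - 1, -1, -1):
        mu = posterior_mean(xs[t + 1], t, eps_fn(xs[t + 1], t), betas, alphas, alpha_bars)
        zs.append((xs[t] - mu) / np.sqrt(betas[t]))
    return xs[-1], zs  # zs is ordered from the noisiest step to the cleanest

def generate(x_T, zs, eps_fn, betas, alphas, alpha_bars):
    # Run the reverse process while reusing the stored noise maps.
    x = x_T
    for z, t in zip(zs, range(len(betas) - 1, -1, -1)):
        mu = posterior_mean(x, t, eps_fn(x, t), betas, alphas, alpha_bars)
        x = mu + np.sqrt(betas[t]) * z
    return x

def guided_eps(x, t, scale=2.0):
    # Simplified guidance: push the estimate toward an "edit" direction.
    base = toy_eps(x, t)
    return base + scale * (toy_eps(x, t, shift=0.5) - base)

rng = np.random.default_rng(0)
betas, alphas, alpha_bars = make_schedule()
x0 = rng.standard_normal(4)  # stands in for a real image (or latent)
x_T, zs = invert(x0, toy_eps, betas, alphas, alpha_bars, rng)
recon = generate(x_T, zs, toy_eps, betas, alphas, alpha_bars)
edited = generate(x_T, zs, guided_eps, betas, alphas, alpha_bars)
```

With the same predictor used for inversion and generation, `recon` matches `x0` up to floating-point error, which is why no optimization step is needed. Swapping in `guided_eps` changes the result while the stored noise maps keep it tied to the input.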