
LEDITS: Real Image Editing with DDPM Inversion and Semantic Guidance

July 2, 2023
Authors: Linoy Tsaban, Apolinário Passos
cs.AI

Abstract

Recent large-scale text-guided diffusion models provide powerful image-generation capabilities. Significant effort is currently being devoted to enabling the modification of these images using text alone, as a means of offering intuitive and versatile editing. However, editing proves difficult for these generative models because it inherently requires preserving certain content from the original image. Conversely, in text-based models, even minor modifications to the text prompt frequently produce an entirely different result, making it exceedingly challenging to attain a one-shot generation that accurately corresponds to the user's intent. In addition, to edit a real image using these state-of-the-art tools, one must first invert the image into the pre-trained model's domain, which adds another factor affecting edit quality as well as latency. In this exploratory report, we propose LEDITS, a combined lightweight approach for real-image editing that incorporates the Edit Friendly DDPM inversion technique with Semantic Guidance, thus extending Semantic Guidance to real-image editing while also harnessing the editing capabilities of DDPM inversion. This approach achieves versatile edits, both subtle and extensive, as well as alterations in composition and style, while requiring neither optimization nor extensions to the architecture.