

KV-Edit: Training-Free Image Editing for Precise Background Preservation

February 24, 2025
作者: Tianrui Zhu, Shiyi Zhang, Jiawei Shao, Yansong Tang
cs.AI

Abstract

Background consistency remains a significant challenge in image editing tasks. Despite extensive development, existing methods still face a trade-off between preserving similarity to the original image and generating content that aligns with the editing target. Here we propose KV-Edit, a training-free approach that uses the KV cache in DiTs to maintain background consistency: background tokens are preserved rather than regenerated, which eliminates the need for complex mechanisms or expensive training and yields new content that integrates seamlessly with the background within user-specified regions. We further analyze the memory consumption of the KV cache during editing and optimize the space complexity to O(1) with an inversion-free method. Our approach is compatible with any DiT-based generative model without additional training. Experiments demonstrate that KV-Edit significantly outperforms existing approaches in both background preservation and image quality, even surpassing training-based methods. The project webpage is available at https://xilluill.github.io/projectpages/KV-Edit.
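The core idea — cache the background tokens' keys and values, then recompute attention only for tokens inside the user-specified edit region — can be illustrated with a toy single-head attention step. This is a minimal NumPy sketch under stated assumptions, not the authors' implementation: `kv_edit_step`, the weight matrices, and the binary `mask` are hypothetical names, and real DiTs operate on multi-head attention over latent patches across many denoising steps.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def kv_edit_step(x, mask, k_cache, v_cache, wq, wk, wv):
    """Toy attention step in the spirit of KV-Edit (hypothetical
    simplification): background K/V come from a cache built during
    inversion; only foreground (masked) tokens are recomputed."""
    fg = mask.astype(bool)
    q = x[fg] @ wq                        # queries only for edited tokens
    k_fg, v_fg = x[fg] @ wk, x[fg] @ wv   # fresh K/V for the edit region
    # Foreground queries attend to preserved background K/V plus fresh
    # foreground K/V, so new content stays consistent with the background.
    k = np.concatenate([k_cache, k_fg], axis=0)
    v = np.concatenate([v_cache, v_fg], axis=0)
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    out = x.copy()                        # background tokens pass through untouched
    out[fg] = attn @ v
    return out
```

Because background tokens are copied through rather than regenerated, their values are bit-identical to the input — the mechanism behind the paper's "preserved rather than regenerated" claim; the cache itself is what the O(1) inversion-free variant avoids storing per step.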

