

PhotoDoodle: Learning Artistic Image Editing from Few-Shot Pairwise Data

February 20, 2025
Authors: Shijie Huang, Yiren Song, Yuxuan Zhang, Hailong Guo, Xueyin Wang, Mike Zheng Shou, Jiaming Liu
cs.AI

Abstract

We introduce PhotoDoodle, a novel image editing framework designed to facilitate photo doodling by enabling artists to overlay decorative elements onto photographs. Photo doodling is challenging because the inserted elements must appear seamlessly integrated with the background, requiring realistic blending, perspective alignment, and contextual coherence. Additionally, the background must be preserved without distortion, and the artist's unique style must be captured efficiently from limited training data. These requirements are not addressed by previous methods that primarily focus on global style transfer or regional inpainting. The proposed method, PhotoDoodle, employs a two-stage training strategy. Initially, we train a general-purpose image editing model, OmniEditor, using large-scale data. Subsequently, we fine-tune this model with EditLoRA using a small, artist-curated dataset of before-and-after image pairs to capture distinct editing styles and techniques. To enhance consistency in the generated results, we introduce a positional encoding reuse mechanism. Additionally, we release a PhotoDoodle dataset featuring six high-quality styles. Extensive experiments demonstrate the advanced performance and robustness of our method in customized image editing, opening new possibilities for artistic creation.

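The abstract outlines a two-stage recipe: a general-purpose editor (OmniEditor) trained at scale, then a lightweight EditLoRA fine-tuned on a handful of artist-curated before/after pairs. The sketch below is a minimal, self-contained illustration of that second stage in plain PyTorch. `ToyEditor`, `LoRALinear`, `add_lora`, and the random placeholder tensors are hypothetical stand-ins, not the authors' code; the real system is a diffusion-based editor trained with a denoising objective rather than the simple MSE regression used here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LoRALinear(nn.Module):
    """Wrap a frozen linear layer with a trainable low-rank update: W x + (alpha/r) * B A x."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)          # stage-1 weights stay frozen
        self.lora_a = nn.Linear(base.in_features, rank, bias=False)
        self.lora_b = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)   # adapter starts as a no-op edit
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * self.lora_b(self.lora_a(x))


class ToyEditor(nn.Module):
    """Hypothetical stand-in for the stage-1 editor backbone (maps 'before' features to 'after' features)."""
    def __init__(self, dim: int = 64):
        super().__init__()
        self.proj_in = nn.Linear(dim, dim)
        self.proj_out = nn.Linear(dim, dim)

    def forward(self, x):
        return self.proj_out(torch.relu(self.proj_in(x)))


def add_lora(model: nn.Module, rank: int = 8):
    """Recursively replace every nn.Linear in the model with a LoRA-wrapped copy."""
    for name, child in list(model.named_children()):
        if isinstance(child, nn.Linear):
            setattr(model, name, LoRALinear(child, rank=rank))
        else:
            add_lora(child, rank)


# Stage 2 (EditLoRA-style): optimize only the adapter weights on a few before/after pairs.
editor = ToyEditor()
add_lora(editor, rank=4)
trainable = [p for p in editor.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=1e-4)

before = torch.randn(8, 64)                  # placeholder "before" photo features
after = before + 0.1 * torch.randn(8, 64)    # placeholder doodled "after" features

for step in range(100):
    pred = editor(before)
    loss = F.mse_loss(pred, after)           # stand-in for the real diffusion training loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The design choice mirrored here is that the stage-1 weights are frozen, so the small pairwise dataset only shapes the low-rank adapter; this is what allows an artist's style to be captured from limited data without degrading the general editing model.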