PhotoDoodle: Learning Artistic Image Editing from Few-Shot Pairwise Data
February 20, 2025
Authors: Shijie Huang, Yiren Song, Yuxuan Zhang, Hailong Guo, Xueyin Wang, Mike Zheng Shou, Jiaming Liu
cs.AI
Abstract
We introduce PhotoDoodle, a novel image editing framework designed to
facilitate photo doodling by enabling artists to overlay decorative elements
onto photographs. Photo doodling is challenging because the inserted elements
must appear seamlessly integrated with the background, requiring realistic
blending, perspective alignment, and contextual coherence. Additionally, the
background must be preserved without distortion, and the artist's unique style
must be captured efficiently from limited training data. These requirements are
not addressed by previous methods that primarily focus on global style transfer
or regional inpainting. The proposed method, PhotoDoodle, employs a two-stage
training strategy. Initially, we train a general-purpose image editing model,
OmniEditor, using large-scale data. Subsequently, we fine-tune this model with
EditLoRA using a small, artist-curated dataset of before-and-after image pairs
to capture distinct editing styles and techniques. To enhance consistency in
the generated results, we introduce a positional encoding reuse mechanism.
Additionally, we release a PhotoDoodle dataset featuring six high-quality
styles. Extensive experiments demonstrate the advanced performance and
robustness of our method in customized image editing, opening new possibilities
for artistic creation.
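
The second training stage described in the abstract fine-tunes the pretrained editor with EditLoRA on a small set of image pairs. The abstract does not spell out how EditLoRA is built, so the following is only a minimal sketch of generic low-rank adaptation (LoRA), the technique its name suggests, and not the authors' implementation. The class name `LoRALinear`, the helper `matmul`, and the `rank` and `alpha` values are all hypothetical choices made for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


class LoRALinear:
    """Frozen weight W (out_dim x in_dim) plus a trainable low-rank update B @ A.

    Generic LoRA sketch; not taken from the PhotoDoodle code base.
    """

    def __init__(self, weight, rank=2, alpha=4.0, seed=0):
        rng = random.Random(seed)
        out_dim, in_dim = len(weight), len(weight[0])
        self.weight = weight          # stands in for frozen stage-one weights
        self.scale = alpha / rank     # usual LoRA scaling factor
        # A starts small and random, B starts at zero, so the adapter
        # leaves the base layer unchanged before any fine-tuning step.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(in_dim)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(out_dim)]

    def effective_weight(self):
        """Return W + scale * (B @ A)."""
        delta = matmul(self.B, self.A)
        return [[w + self.scale * d for w, d in zip(w_row, d_row)]
                for w_row, d_row in zip(self.weight, delta)]

    def forward(self, x):
        """Apply the adapted layer to a single input vector of length in_dim."""
        column = [[v] for v in x]
        return [row[0] for row in matmul(self.effective_weight(), column)]

    def num_trainable(self):
        """Count adapter parameters (entries of A and B); W stays frozen."""
        return sum(len(r) for r in self.A) + sum(len(r) for r in self.B)
```

In a setup like this, only `A` and `B` would be updated during fine-tuning, which is one plausible reason a handful of before-and-after pairs can be enough to capture a style without disturbing the base editor. For realistic layer sizes the adapter holds far fewer parameters than the frozen weight, although that is not true for a toy matrix as small as the one a quick test would use.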