Customizing Text-to-Image Models with a Single Image Pair

May 2, 2024
Authors: Maxwell Jones, Sheng-Yu Wang, Nupur Kumari, David Bau, Jun-Yan Zhu
cs.AI

Abstract

Art reinterpretation is the practice of creating a variation of a reference work, making a paired artwork that exhibits a distinct artistic style. We ask if such an image pair can be used to customize a generative model to capture the demonstrated stylistic difference. We propose Pair Customization, a new customization method that learns stylistic difference from a single image pair and then applies the acquired style to the generation process. Unlike existing methods that learn to mimic a single concept from a collection of images, our method captures the stylistic difference between paired images. This allows us to apply a stylistic change without overfitting to the specific image content in the examples. To address this new task, we employ a joint optimization method that explicitly separates the style and content into distinct LoRA weight spaces. We optimize these style and content weights to reproduce the style and content images while encouraging their orthogonality. During inference, we modify the diffusion process via a new style guidance based on our learned weights. Both qualitative and quantitative experiments show that our method can effectively learn style while avoiding overfitting to image content, highlighting the potential of modeling such stylistic differences from a single image pair.
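
The abstract names two mechanisms without giving formulas: an orthogonality constraint between the style and content LoRA weights, and a style guidance term applied at inference. The sketch below is one plausible reading of each and is not the authors' implementation. The penalty form (squared Frobenius norm of the cross product), the guidance formula (modeled on classifier-free guidance), and all function and parameter names are assumptions made for illustration.

```python
from typing import List

Matrix = List[List[float]]


def transpose_times(a: Matrix, b: Matrix) -> Matrix:
    """Return a^T @ b for two matrices that share the same number of rows."""
    rows = len(a)
    return [
        [sum(a[k][i] * b[k][j] for k in range(rows)) for j in range(len(b[0]))]
        for i in range(len(a[0]))
    ]


def orthogonality_penalty(style: Matrix, content: Matrix) -> float:
    """Assumed penalty: squared Frobenius norm of style^T @ content.

    It is zero exactly when every style column is orthogonal to every
    content column, so minimizing it pushes the two weight sets apart.
    """
    cross = transpose_times(style, content)
    return sum(v * v for row in cross for v in row)


def style_guided_noise(
    eps_base: List[float],
    eps_content: List[float],
    eps_style: List[float],
    content_scale: float,
    style_scale: float,
) -> List[float]:
    """Assumed guidance rule, by analogy with classifier-free guidance.

    eps_base:    noise prediction without the text condition
    eps_content: text-conditioned prediction without the style weights
    eps_style:   text-conditioned prediction with the style weights applied
    The style term is added as a separate direction (style minus content),
    so style_scale = 0 recovers ordinary guided sampling.
    """
    return [
        b + content_scale * (c - b) + style_scale * (s - c)
        for b, c, s in zip(eps_base, eps_content, eps_style)
    ]
```

Under these assumptions, the penalty would be added to the reconstruction losses during joint training, and `style_scale` would control how strongly the learned style is applied at sampling time.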
