
Magic Insert: Style-Aware Drag-and-Drop

July 2, 2024
Authors: Nataniel Ruiz, Yuanzhen Li, Neal Wadhwa, Yael Pritch, Michael Rubinstein, David E. Jacobs, Shlomi Fruchter
cs.AI

Abstract

We present Magic Insert, a method for dragging-and-dropping subjects from a user-provided image into a target image of a different style in a physically plausible manner while matching the style of the target image. This work formalizes the problem of style-aware drag-and-drop and presents a method for tackling it by addressing two sub-problems: style-aware personalization and realistic object insertion in stylized images. For style-aware personalization, our method first fine-tunes a pretrained text-to-image diffusion model using LoRA and learned text tokens on the subject image, and then infuses it with a CLIP representation of the target style. For object insertion, we use Bootstrapped Domain Adaption to adapt a domain-specific photorealistic object insertion model to the domain of diverse artistic styles. Overall, the method significantly outperforms traditional approaches such as inpainting. Finally, we present a dataset, SubjectPlop, to facilitate evaluation and future progress in this area. Project page: https://magicinsert.github.io/
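
The abstract mentions fine-tuning with LoRA but gives no implementation details. As a rough illustration only, the generic LoRA idea (a frozen weight matrix plus a trainable low-rank update) can be sketched as below. This is not the authors' code: the class name `LoRALinear`, the `alpha` scaling, and the initialization values are assumptions chosen for the example.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


class LoRALinear:
    """Hypothetical linear layer with a low-rank (LoRA-style) update.

    Computes y = W x + (alpha / rank) * B (A x), where W stays frozen
    and only A and B would be trained.
    """

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight          # frozen pretrained weight, d_out x d_in
        self.scale = alpha / rank     # common LoRA scaling convention
        # A starts with small random values and B with zeros, so the
        # layer initially reproduces the frozen weight exactly.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def forward(self, x):
        base = matvec(self.weight, x)
        delta = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * d for b, d in zip(base, delta)]

    def num_trainable(self):
        """Count the entries of A and B (the only trainable parameters)."""
        return sum(len(r) for r in self.A) + sum(len(r) for r in self.B)
```

With rank `r`, the update adds `r * (d_in + d_out)` trainable values per layer instead of `d_in * d_out`, which is why this kind of fine-tuning is practical on a single subject image.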
