Magic Insert: Style-Aware Drag-and-Drop

July 2, 2024
Authors: Nataniel Ruiz, Yuanzhen Li, Neal Wadhwa, Yael Pritch, Michael Rubinstein, David E. Jacobs, Shlomi Fruchter
cs.AI

Abstract

We present Magic Insert, a method for dragging and dropping subjects from a user-provided image into a target image of a different style in a physically plausible manner while matching the style of the target image. This work formalizes the problem of style-aware drag-and-drop and presents a method for tackling it by addressing two sub-problems: style-aware personalization and realistic object insertion in stylized images. For style-aware personalization, our method first fine-tunes a pretrained text-to-image diffusion model using LoRA and learned text tokens on the subject image, and then infuses it with a CLIP representation of the target style. For object insertion, we use Bootstrapped Domain Adaptation to adapt a domain-specific photorealistic object insertion model to the domain of diverse artistic styles. Overall, the method significantly outperforms traditional approaches such as inpainting. Finally, we present a dataset, SubjectPlop, to facilitate evaluation and future progress in this area. Project page: https://magicinsert.github.io/
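The abstract names LoRA as the fine-tuning mechanism for personalization. As a rough illustration of what a LoRA update does to a single weight matrix, here is a minimal pure-Python sketch of the generic low-rank formulation W + (alpha / r) * B A. It is not the authors' implementation: the function names, shapes, and initialization values are assumptions chosen for illustration, and the diffusion model, learned text tokens, and CLIP style infusion are not modeled.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def init_lora(d_out, d_in, r, seed=0):
    """Hypothetical initializer: small random A, zero B.

    With B = 0 the adapted layer starts out identical to the frozen one.
    """
    rng = random.Random(seed)
    a = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(r)]
    b = [[0.0] * r for _ in range(d_out)]
    return a, b


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A.

    w: frozen pretrained weight, shape (d_out, d_in)
    a: trainable down-projection, shape (r, d_in)
    b: trainable up-projection, shape (d_out, r)
    """
    r = len(a)
    scale = alpha / r
    delta = matmul(b, a)  # low-rank update, shape (d_out, d_in)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]
```

Only A and B would be trained in such a setup, so the number of trainable parameters scales with the rank r rather than with the full size of W.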
