PALP: Prompt Aligned Personalization of Text-to-Image Models

January 11, 2024
Authors: Moab Arar, Andrey Voynov, Amir Hertz, Omri Avrahami, Shlomi Fruchter, Yael Pritch, Daniel Cohen-Or, Ariel Shamir
cs.AI

Abstract

Content creators often aim to create personalized images using personal subjects that go beyond the capabilities of conventional text-to-image models. Additionally, they may want the resulting image to encompass a specific location, style, ambiance, and more. Existing personalization methods may compromise personalization ability or the alignment to complex textual prompts. This trade-off can impede the fulfillment of user prompts and subject fidelity. We propose a new approach focusing on personalization methods for a single prompt to address this issue. We term our approach prompt-aligned personalization. While this may seem restrictive, our method excels in improving text alignment, enabling the creation of images with complex and intricate prompts, which may pose a challenge for current techniques. In particular, our method keeps the personalized model aligned with a target prompt using an additional score distillation sampling term. We demonstrate the versatility of our method in multi- and single-shot settings and further show that it can compose multiple subjects or use inspiration from reference images, such as artworks. We compare our approach quantitatively and qualitatively with existing baselines and state-of-the-art techniques.
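
The abstract names score distillation sampling (SDS) as the term that keeps the personalized model aligned with the target prompt, but gives no formula. As a rough illustration only, the sketch below shows the generic SDS gradient, w(t) * (predicted noise - injected noise), on a toy problem. It is not the authors' formulation: `sds_gradient`, `make_oracle_predictor`, the fixed noise level, and the closed-form "oracle" noise predictor are all hypothetical stand-ins for a pretrained, prompt-conditioned diffusion model.

```python
import numpy as np


def make_oracle_predictor(target, alpha_bar_t):
    """Hypothetical stand-in for a prompt-conditioned noise predictor.

    It assumes the "prompt" describes exactly one clean sample, `target`,
    and returns the noise that would map `target` to the noisy input x_t.
    """
    a = np.sqrt(alpha_bar_t)
    s = np.sqrt(1.0 - alpha_bar_t)

    def predict_noise(x_t):
        return (x_t - a * target) / s

    return predict_noise


def sds_gradient(x0, predict_noise, alpha_bar_t, weight, rng):
    """Generic SDS gradient with respect to the clean sample x0.

    This is the textbook form, weight * (eps_hat - eps); the weighting and
    conditioning used in PALP itself are not given in the abstract.
    """
    eps = rng.standard_normal(x0.shape)
    # Forward diffusion: noise the current sample to noise level alpha_bar_t.
    x_t = np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * eps
    eps_hat = predict_noise(x_t)
    return weight * (eps_hat - eps)


def align_to_target(x0, target, steps=50, lr=0.1, alpha_bar_t=0.5, seed=0):
    """Run plain gradient descent on x0 using the SDS gradient."""
    rng = np.random.default_rng(seed)
    predict_noise = make_oracle_predictor(target, alpha_bar_t)
    x = x0.astype(float).copy()
    for _ in range(steps):
        x -= lr * sds_gradient(x, predict_noise, alpha_bar_t, 1.0, rng)
    return x
```

With the oracle predictor the gradient reduces to a multiple of `x0 - target`, so repeated steps pull the sample toward what the "prompt" describes. In a real personalization setting the gradient would instead be backpropagated into the model parameters, and the predictor would be a frozen pretrained network conditioned on the target prompt.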