PALP: Prompt Aligned Personalization of Text-to-Image Models

Abstract

Content creators often aim to create personalized images using personal subjects that go beyond the capabilities of conventional text-to-image models. Additionally, they may want the resulting image to encompass a specific location, style, ambiance, and more. Existing personalization methods may compromise either personalization ability or alignment with complex textual prompts. This trade-off can impede both the fulfillment of user prompts and subject fidelity. To address this issue, we propose a new approach that focuses on personalization for a single prompt. We term our approach prompt-aligned personalization. While this may seem restrictive, our method excels at improving text alignment, enabling the creation of images from complex and intricate prompts that may pose a challenge for current techniques. In particular, our method keeps the personalized model aligned with a target prompt using an additional score distillation sampling term. We demonstrate the versatility of our method in multi- and single-shot settings and further show that it can compose multiple subjects or draw inspiration from reference images, such as artworks. We compare our approach quantitatively and qualitatively with existing baselines and state-of-the-art techniques.
