PALP: Prompt Aligned Personalization of Text-to-Image Models

January 11, 2024
Authors: Moab Arar, Andrey Voynov, Amir Hertz, Omri Avrahami, Shlomi Fruchter, Yael Pritch, Daniel Cohen-Or, Ariel Shamir
cs.AI

Abstract

Content creators often aim to create personalized images using personal subjects that go beyond the capabilities of conventional text-to-image models. Additionally, they may want the resulting image to encompass a specific location, style, ambiance, and more. Existing personalization methods may compromise personalization ability or the alignment to complex textual prompts. This trade-off can impede the fulfillment of user prompts and subject fidelity. We propose a new approach focusing on personalization methods for a single prompt to address this issue. We term our approach prompt-aligned personalization. While this may seem restrictive, our method excels in improving text alignment, enabling the creation of images with complex and intricate prompts, which may pose a challenge for current techniques. In particular, our method keeps the personalized model aligned with a target prompt using an additional score distillation sampling term. We demonstrate the versatility of our method in multi- and single-shot settings and further show that it can compose multiple subjects or use inspiration from reference images, such as artworks. We compare our approach quantitatively and qualitatively with existing baselines and state-of-the-art techniques.