

ViPer: Visual Personalization of Generative Models via Individual Preference Learning

July 24, 2024
作者: Sogand Salehi, Mahdi Shafiei, Teresa Yeo, Roman Bachmann, Amir Zamir
cs.AI

Abstract

Different users find different images generated for the same prompt desirable. This gives rise to personalized image generation, which involves creating images aligned with an individual's visual preference. Current generative models are, however, unpersonalized, as they are tuned to produce outputs that appeal to a broad audience. Using them to generate images aligned with individual users relies on iterative manual prompt engineering by the user, which is inefficient and undesirable. We propose to personalize the image generation process by first capturing the generic preferences of the user in a one-time process by inviting them to comment on a small selection of images, explaining why they like or dislike each. Based on these comments, we infer a user's structured liked and disliked visual attributes, i.e., their visual preference, using a large language model. These attributes are used to guide a text-to-image model toward producing images that are tuned towards the individual user's visual preference. Through a series of user studies and large language model guided evaluations, we demonstrate that the proposed method results in generations that are well aligned with individual users' visual preferences.
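The pipeline described in the abstract can be sketched as two steps: turn the user's free-form comments into an LLM query that extracts structured liked/disliked visual attributes, then fold those attributes into positive and negative guidance for a text-to-image model. The sketch below is illustrative only; the function names and prompt formats are assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the ViPer-style personalization flow.
# All names and prompt templates here are illustrative assumptions.

def build_preference_query(comments):
    """Format free-form image comments into a single LLM query that asks
    for the user's structured liked/disliked visual attributes."""
    joined = "\n".join(f"- {c}" for c in comments)
    return (
        "From the comments below, list the visual attributes the user "
        "likes and dislikes, as two comma-separated lists.\n" + joined
    )

def personalize_prompt(prompt, liked, disliked):
    """Steer a text-to-image model by appending the inferred liked
    attributes to the prompt and using disliked ones as a negative prompt."""
    positive = prompt + ", " + ", ".join(liked)
    negative = ", ".join(disliked)
    return positive, negative

# Example usage with attributes an LLM might have inferred:
pos, neg = personalize_prompt(
    "a city street at dusk",
    liked=["soft lighting", "muted colors"],
    disliked=["harsh contrast"],
)
```

In practice, the extracted attributes would come from parsing the LLM's answer to `build_preference_query`, and the positive/negative pair maps naturally onto the prompt and negative-prompt inputs common to diffusion-based text-to-image models.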
