ViPer: Visual Personalization of Generative Models via Individual Preference Learning
July 24, 2024
作者: Sogand Salehi, Mahdi Shafiei, Teresa Yeo, Roman Bachmann, Amir Zamir
cs.AI
Abstract
Different users find different images generated for the same prompt
desirable. This gives rise to personalized image generation which involves
creating images aligned with an individual's visual preference. Current
generative models are, however, unpersonalized, as they are tuned to produce
outputs that appeal to a broad audience. Using them to generate images aligned
with individual users relies on iterative manual prompt engineering by the user
which is inefficient and undesirable. We propose to personalize the image
generation process by first capturing the generic preferences of the user in a
one-time process by inviting them to comment on a small selection of images,
explaining why they like or dislike each. Based on these comments, we infer a
user's structured liked and disliked visual attributes, i.e., their visual
preference, using a large language model. These attributes are used to guide a
text-to-image model toward producing images that are tuned towards the
individual user's visual preference. Through a series of user studies and large
language model guided evaluations, we demonstrate that the proposed method
results in generations that are well aligned with individual users' visual
preferences.
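The pipeline described in the abstract (collect free-form comments on a few images, use an LLM to extract structured liked/disliked visual attributes, then steer a text-to-image model with those attributes) can be sketched roughly as below. This is a minimal illustration and not the authors' implementation: `query_llm` is a hypothetical placeholder for any instruction-following LLM, Stable Diffusion via the `diffusers` library stands in for the text-to-image model, and injecting the preference through the prompt and negative prompt is only one simple approximation of the guidance the paper describes.

```python
# Minimal sketch of a preference-personalized generation pipeline.
# Not the ViPer method itself; all names and model IDs below are assumptions.
import torch
from diffusers import StableDiffusionPipeline


def query_llm(prompt: str) -> str:
    """Placeholder: call any instruction-following LLM and return its reply."""
    raise NotImplementedError


def extract_visual_preference(comments: list[str]) -> str:
    """Infer structured liked/disliked visual attributes from free-form comments."""
    prompt = (
        "A user commented on several images, explaining what they like or dislike:\n"
        + "\n".join(f"- {c}" for c in comments)
        + "\nSummarize the user's visual preference as two lists of visual "
          "attributes, labeled 'Liked:' and 'Disliked:'."
    )
    return query_llm(prompt)


def generate_personalized(prompt: str, preference: str, seed: int = 0):
    """Guide a text-to-image model toward the inferred visual preference."""
    # Example model ID; any text-to-image diffusion model could be substituted.
    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    # Crude preference injection: liked attributes go into the prompt,
    # disliked attributes into the negative prompt.
    liked, _, disliked = preference.partition("Disliked:")
    generator = torch.Generator("cuda").manual_seed(seed)
    result = pipe(
        prompt=f"{prompt}, {liked.replace('Liked:', '').strip()}",
        negative_prompt=disliked.strip(),
        generator=generator,
    )
    return result.images[0]
```

In this sketch the preference extraction is a one-time step per user, after which the same preference string can be reused to personalize generations for arbitrary prompts.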