Prompt Expansion for Adaptive Text-to-Image Generation
December 27, 2023
Authors: Siddhartha Datta, Alexander Ku, Deepak Ramachandran, Peter Anderson
cs.AI
Abstract
Text-to-image generation models are powerful but difficult to use. Users
craft specific prompts to get better images, though the images can be
repetitive. This paper proposes a Prompt Expansion framework that helps users
generate high-quality, diverse images with less effort. The Prompt Expansion
model takes a text query as input and outputs a set of expanded text prompts
that are optimized so that, when passed to a text-to-image model, they generate
a wider variety of appealing images. We conduct a human evaluation study that
shows that images generated through Prompt Expansion are more aesthetically
pleasing and diverse than those generated by baseline methods. Overall, this
paper presents a novel and effective approach to improving the text-to-image
generation experience.
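
The query-in, prompts-out interface described in the abstract can be illustrated with a minimal sketch. This is not the paper's method: the Prompt Expansion model is described only as a model that is optimized for this task, whereas the code below is a toy template-based stand-in. The names expand_prompt, generate_images, STYLES, and DETAILS are hypothetical, and text_to_image is any caller-supplied function that maps a prompt string to an image.

```python
from typing import Callable, List, TypeVar

Image = TypeVar("Image")

# Hypothetical modifier lists, chosen only for illustration.
STYLES = ["watercolor painting", "studio photograph", "pencil sketch"]
DETAILS = ["soft morning light", "dramatic shadows", "vivid colors"]


def expand_prompt(query: str, num_prompts: int = 4) -> List[str]:
    """Return distinct expansions of `query` (toy stand-in, not a learned model)."""
    max_prompts = len(STYLES) * len(DETAILS)
    count = max(0, min(num_prompts, max_prompts))
    prompts = []
    for i in range(count):
        # Cycle through styles first so consecutive prompts differ in style,
        # and shift the detail index on each pass to avoid repeating a pair.
        style = STYLES[i % len(STYLES)]
        detail = DETAILS[(i // len(STYLES) + i) % len(DETAILS)]
        prompts.append(f"{query}, {style}, {detail}")
    return prompts


def generate_images(
    query: str,
    text_to_image: Callable[[str], Image],
    num_prompts: int = 4,
) -> List[Image]:
    """Expand `query`, then pass each expanded prompt to the image model."""
    return [text_to_image(p) for p in expand_prompt(query, num_prompts)]


if __name__ == "__main__":
    for prompt in expand_prompt("a cat on a windowsill"):
        print(prompt)
```

The user supplies a single short query and receives one image per expanded prompt, so any variety in the results comes from the expansion step rather than from manual prompt engineering.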