NeuroPrompts: An Adaptive Framework to Optimize Prompts for Text-to-Image Generation

November 20, 2023
Authors: Shachar Rosenman, Vasudev Lal, Phillip Howard
cs.AI

Abstract

Despite impressive recent advances in text-to-image diffusion models, obtaining high-quality images often requires prompt engineering by humans who have developed expertise in using them. In this work, we present NeuroPrompts, an adaptive framework that automatically enhances a user's prompt to improve the quality of generations produced by text-to-image models. Our framework utilizes constrained text decoding with a pre-trained language model that has been adapted to generate prompts similar to those produced by human prompt engineers. This approach enables higher-quality text-to-image generations and provides user control over stylistic features via constraint set specification. We demonstrate the utility of our framework by creating an interactive application for prompt enhancement and image generation using Stable Diffusion. Additionally, we conduct experiments utilizing a large dataset of human-engineered prompts for text-to-image generation and show that our approach automatically produces enhanced prompts that result in superior image quality. We make our code, a screencast video demo and a live demo instance of NeuroPrompts publicly available.