NeuroPrompts: An Adaptive Framework to Optimize Prompts for Text-to-Image Generation

Abstract

Despite impressive recent advances in text-to-image diffusion models, obtaining high-quality images often requires prompt engineering by humans who have developed expertise in using these models. In this work, we present NeuroPrompts, an adaptive framework that automatically enhances a user's prompt to improve the quality of generations produced by text-to-image models. Our framework uses constrained text decoding with a pre-trained language model that has been adapted to generate prompts similar to those produced by human prompt engineers. This approach enables higher-quality text-to-image generations and gives the user control over stylistic features via constraint set specification. We demonstrate the utility of our framework by creating an interactive application for prompt enhancement and image generation using Stable Diffusion. Additionally, we conduct experiments using a large dataset of human-engineered prompts for text-to-image generation and show that our approach automatically produces enhanced prompts that result in superior image quality. We make our code, a screencast video demo, and a live demo instance of NeuroPrompts publicly available.
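To make the idea of constrained decoding with constraint sets more concrete, the following is a minimal toy sketch, not the NeuroPrompts implementation. It assumes a hypothetical table of modifier phrases with made-up log-probabilities (`TOY_LOGPROBS`) in place of an adapted language model, and it runs a small beam search that only accepts outputs containing at least one phrase from every user-specified constraint set. All names, phrases, and scores here are illustrative assumptions.

```python
from typing import Dict, FrozenSet, List, Sequence, Tuple

# Hypothetical modifier vocabulary with made-up log-probabilities.
# A real system would obtain these scores from an adapted language model.
TOY_LOGPROBS: Dict[str, float] = {
    "highly detailed": -0.4,
    "sharp focus": -0.9,
    "oil painting": -1.6,
    "watercolor": -2.2,
    "cinematic lighting": -1.2,
    "soft lighting": -2.5,
}


def satisfied(tokens: Sequence[str], constraint_sets: Sequence[FrozenSet[str]]) -> int:
    """Count constraint sets that have at least one member among the tokens."""
    return sum(1 for cs in constraint_sets if any(t in cs for t in tokens))


def constrained_beam_search(
    constraint_sets: Sequence[FrozenSet[str]],
    length: int = 3,
    beam_width: int = 4,
) -> List[str]:
    """Return the best-scoring modifier sequence that meets every constraint set."""
    beams: List[Tuple[List[str], float]] = [([], 0.0)]
    for _ in range(length):
        candidates: List[Tuple[List[str], float]] = []
        for tokens, score in beams:
            for tok, logp in TOY_LOGPROBS.items():
                if tok in tokens:  # do not repeat a modifier
                    continue
                candidates.append((tokens + [tok], score + logp))
        # Rank first by number of constraint sets satisfied, then by score,
        # so hypotheses that make progress on constraints are not pruned.
        candidates.sort(
            key=lambda c: (satisfied(c[0], constraint_sets), c[1]), reverse=True
        )
        beams = candidates[:beam_width]
    complete = [
        b for b in beams if satisfied(b[0], constraint_sets) == len(constraint_sets)
    ]
    if not complete:
        raise ValueError("no hypothesis satisfied all constraint sets")
    return max(complete, key=lambda b: b[1])[0]


def enhance(prompt: str, constraint_sets: Sequence[FrozenSet[str]]) -> str:
    """Append the decoded modifiers to the user's prompt, comma-separated."""
    return ", ".join([prompt] + constrained_beam_search(constraint_sets))


if __name__ == "__main__":
    style = frozenset({"oil painting", "watercolor"})
    lighting = frozenset({"cinematic lighting", "soft lighting"})
    print(enhance("a lighthouse at dusk", [style, lighting]))
```

In this sketch, each constraint set plays the role of a stylistic choice the user wants honored (one style, one lighting option), while the scoring table decides which remaining modifiers fill out the prompt. Ranking hypotheses by constraints satisfied before score is one simple design choice among several possible ones.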
