

Artist: Aesthetically Controllable Text-Driven Stylization without Training

July 22, 2024
Authors: Ruixiang Jiang, Changwen Chen
cs.AI

Abstract

Diffusion models entangle content and style generation during the denoising process, leading to undesired content modification when applied directly to stylization tasks. Existing methods struggle to control the diffusion model effectively enough to meet aesthetic-level stylization requirements. In this paper, we introduce Artist, a training-free approach that aesthetically controls the content and style generation of a pretrained diffusion model for text-driven stylization. Our key insight is to disentangle the denoising of content and style into separate diffusion processes while sharing information between them. We propose simple yet effective content and style control methods that suppress style-irrelevant content generation, resulting in harmonious stylization results. Extensive experiments demonstrate that our method excels at meeting aesthetic-level stylization requirements, preserving intricate details in the content image and aligning well with the style prompt. Furthermore, we showcase the high controllability of the stylization strength from various perspectives. Code will be released; project page: https://DiffusionArtist.github.io
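
The key idea stated in the abstract, disentangling denoising into a content process and a style process that exchange information, can be pictured with a minimal sketch. This is not the authors' released implementation: `unet`, `scheduler`, `return_features`, and `injected_features` are hypothetical stand-ins for a pretrained latent-diffusion denoiser, its noise scheduler, and a feature-sharing hook.

```python
import torch

def stylize(content_latent, content_text_emb, style_text_emb,
            unet, scheduler, num_steps=50):
    """Sketch of dual-branch denoising: a content branch and a style
    branch run in parallel and share intermediate information.
    `unet` and `scheduler` are assumed interfaces (not the paper's code):
    the denoiser returns (noise_prediction, features) and the scheduler
    follows the usual set_timesteps / add_noise / step pattern."""
    scheduler.set_timesteps(num_steps)

    # Both branches start from the same noised version of the content latent.
    noise = torch.randn_like(content_latent)
    t0 = scheduler.timesteps[0]
    content_x = scheduler.add_noise(content_latent, noise, t0)
    style_x = content_x.clone()

    for t in scheduler.timesteps:
        # Content branch: reconstructs the content image, conditioned on
        # a content-describing prompt, and exposes intermediate features.
        content_eps, content_feats = unet(
            content_x, t, content_text_emb, return_features=True)

        # Style branch: denoises toward the style prompt while receiving
        # features from the content branch (the "information sharing" the
        # abstract describes; the injection mechanism used here, e.g.
        # attention-feature replacement, is an assumption).
        style_eps, _ = unet(
            style_x, t, style_text_emb,
            injected_features=content_feats, return_features=True)

        content_x = scheduler.step(content_eps, t, content_x).prev_sample
        style_x = scheduler.step(style_eps, t, style_x).prev_sample

    # The style branch's final latent is decoded into the stylized image.
    return style_x
```

Because both branches reuse the same pretrained denoiser and only exchange features at inference time, no training or fine-tuning is involved, which matches the training-free claim in the abstract.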
