
ZePo: Zero-Shot Portrait Stylization with Faster Sampling

August 10, 2024
Authors: Jin Liu, Huaibo Huang, Jie Cao, Ran He
cs.AI

Abstract

Diffusion-based text-to-image generation models have significantly advanced the field of art content synthesis. However, current portrait stylization methods generally require either model fine-tuning based on examples or the employment of DDIM Inversion to revert images to noise space, both of which substantially decelerate the image generation process. To overcome these limitations, this paper presents an inversion-free portrait stylization framework based on diffusion models that accomplishes content and style feature fusion in merely four sampling steps. We observed that Latent Consistency Models employing consistency distillation can effectively extract representative Consistency Features from noisy images. To blend the Consistency Features extracted from both content and style images, we introduce a Style Enhancement Attention Control technique that meticulously merges content and style features within the attention space of the target image. Moreover, we propose a feature merging strategy to amalgamate redundant features in Consistency Features, thereby reducing the computational load of attention control. Extensive experiments have validated the effectiveness of our proposed framework in enhancing stylization efficiency and fidelity. The code is available at https://github.com/liujin112/ZePo.
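The Style Enhancement Attention Control described above can be illustrated with a minimal sketch: the target image's queries attend over keys and values gathered from both the content and style images, with the style logits scaled to strengthen stylization. All names, shapes, and the `style_scale` parameter below are hypothetical simplifications for illustration, not the paper's actual implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def style_enhanced_attention(q_tgt, k_c, v_c, k_s, v_s, style_scale=1.2):
    """Sketch of cross-image attention fusion (hypothetical simplification).

    q_tgt: (n_t, d) queries from the target image's attention layer.
    k_c, v_c: (n_c, d) keys/values extracted from the content image.
    k_s, v_s: (n_s, d) keys/values extracted from the style image.
    style_scale: scales the style logits to enhance style transfer.
    """
    d = q_tgt.shape[-1]
    logits_c = q_tgt @ k_c.T / np.sqrt(d)
    logits_s = style_scale * (q_tgt @ k_s.T / np.sqrt(d))
    # Softmax over the concatenated content+style tokens, so attention
    # weights are shared across both sources.
    w = softmax(np.concatenate([logits_c, logits_s], axis=-1), axis=-1)
    n_c = k_c.shape[0]
    return w[:, :n_c] @ v_c + w[:, n_c:] @ v_s
```

Because the softmax is taken jointly over content and style tokens, raising `style_scale` shifts attention mass toward the style image's features without changing the output dimensionality. The paper's feature merging strategy would additionally reduce `n_c` and `n_s` by merging redundant tokens before this step, shrinking the attention matrices.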
