HiWave: Training-Free High-Resolution Image Generation via Wavelet-Based Diffusion Sampling
June 25, 2025
Authors: Tobias Vontobel, Seyedmorteza Sadat, Farnood Salehi, Romann M. Weber
cs.AI
Abstract
Diffusion models have emerged as the leading approach for image synthesis,
demonstrating exceptional photorealism and diversity. However, training
diffusion models at high resolutions remains computationally prohibitive, and
existing zero-shot generation techniques for synthesizing images beyond
training resolutions often produce artifacts, including object duplication and
spatial incoherence. In this paper, we introduce HiWave, a training-free,
zero-shot approach that substantially enhances visual fidelity and structural
coherence in ultra-high-resolution image synthesis using pretrained diffusion
models. Our method employs a two-stage pipeline: generating a base image from
the pretrained model followed by a patch-wise DDIM inversion step and a novel
wavelet-based detail enhancer module. Specifically, we first utilize inversion
methods to derive initial noise vectors that preserve global coherence from the
base image. Subsequently, during sampling, our wavelet-domain detail enhancer
retains low-frequency components from the base image to ensure structural
consistency, while selectively guiding high-frequency components to enrich fine
details and textures. Extensive evaluations using Stable Diffusion XL
demonstrate that HiWave effectively mitigates common visual artifacts seen in
prior methods, achieving superior perceptual quality. In a user study, HiWave
was preferred over the state-of-the-art alternative in more than 80% of
comparisons, highlighting its effectiveness for
high-quality, ultra-high-resolution image synthesis without requiring
retraining or architectural modifications.
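
To make the frequency split described above concrete, here is a minimal Python sketch of the general idea: keep the low-frequency band of one image and take the high-frequency bands from another. It is not the authors' implementation. It assumes a single-level Haar transform on plain nested lists, works on two fixed arrays rather than inside a diffusion sampling loop, and the names `haar2d`, `ihaar2d`, `combine_bands`, and the `gain` parameter are hypothetical, chosen only for illustration.

```python
def haar2d(img):
    """Single-level 2D Haar transform of an even-sized grid (list of rows).

    Returns (ll, lh, hl, hh): ll is the low-frequency band, the other
    three are the detail (high-frequency) bands.
    """
    h, w = len(img), len(img[0])
    ll, lh, hl, hh = ([[0.0] * (w // 2) for _ in range(h // 2)] for _ in range(4))
    for i in range(0, h, 2):
        for j in range(0, w, 2):
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            r, s = i // 2, j // 2
            ll[r][s] = (a + b + c + d) / 2
            lh[r][s] = (a - b + c - d) / 2
            hl[r][s] = (a + b - c - d) / 2
            hh[r][s] = (a - b - c + d) / 2
    return ll, lh, hl, hh


def ihaar2d(ll, lh, hl, hh):
    """Inverse of haar2d: rebuild the full-size grid from the four bands."""
    h, w = len(ll), len(ll[0])
    img = [[0.0] * (2 * w) for _ in range(2 * h)]
    for r in range(h):
        for s in range(w):
            p, q, u, v = ll[r][s], lh[r][s], hl[r][s], hh[r][s]
            img[2 * r][2 * s] = (p + q + u + v) / 2
            img[2 * r][2 * s + 1] = (p - q + u - v) / 2
            img[2 * r + 1][2 * s] = (p + q - u - v) / 2
            img[2 * r + 1][2 * s + 1] = (p - q - u + v) / 2
    return img


def combine_bands(base, detail, gain=1.0):
    """Low-frequency band from `base`, high-frequency bands from `detail`.

    `gain` (hypothetical) scales the detail bands; gain=0 keeps only the
    low-frequency content of `base`.
    """
    ll, _, _, _ = haar2d(base)
    _, lh, hl, hh = haar2d(detail)

    def scale(band):
        return [[gain * v for v in row] for row in band]

    return ihaar2d(ll, scale(lh), scale(hl), scale(hh))


# Toy inputs: a flat "structure" image and a checkerboard "texture" image.
base = [[2.0] * 4 for _ in range(4)]
detail = [[1.0 if (i + j) % 2 == 0 else -1.0 for j in range(4)] for i in range(4)]
out = combine_bands(base, detail)
```

In this toy case the flat image contributes only low-frequency content and the checkerboard contributes only high-frequency content, so `out` is the flat image with the checkerboard texture added on top. A production version would more likely rely on an established wavelet library and operate on the model's predictions at each sampling step.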