HiWave: Training-Free High-Resolution Image Generation via Wavelet-Based Diffusion Sampling
June 25, 2025
Authors: Tobias Vontobel, Seyedmorteza Sadat, Farnood Salehi, Romann M. Weber
cs.AI
Abstract
Diffusion models have emerged as the leading approach for image synthesis,
demonstrating exceptional photorealism and diversity. However, training
diffusion models at high resolutions remains computationally prohibitive, and
existing zero-shot generation techniques for synthesizing images beyond
training resolutions often produce artifacts, including object duplication and
spatial incoherence. In this paper, we introduce HiWave, a training-free,
zero-shot approach that substantially enhances visual fidelity and structural
coherence in ultra-high-resolution image synthesis using pretrained diffusion
models. Our method employs a two-stage pipeline: generating a base image from
the pretrained model followed by a patch-wise DDIM inversion step and a novel
wavelet-based detail enhancer module. Specifically, we first use inversion to
derive, from the base image, initial noise vectors that preserve global
coherence. Subsequently, during sampling, our wavelet-domain detail enhancer
retains low-frequency components from the base image to ensure structural
consistency, while selectively guiding high-frequency components to enrich fine
details and textures. Extensive evaluations using Stable Diffusion XL
demonstrate that HiWave effectively mitigates common visual artifacts seen in
prior methods, achieving superior perceptual quality. In a user study, HiWave
was preferred over the state-of-the-art alternative in more than 80% of
comparisons, highlighting its effectiveness for high-quality,
ultra-high-resolution image synthesis without requiring retraining or
architectural modifications.
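
The frequency split described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the wavelet, the number of decomposition levels, or the guidance rule, so the single-level Haar transform, the function names, and the `high_gain` parameter below are all assumptions chosen for illustration. The sketch works on plain 2D arrays, keeps the low-frequency band of a base image, and takes the high-frequency bands from a second array standing in for the current sampling estimate.

```python
import numpy as np


def haar_dwt2(x):
    """Single-level 2D Haar transform of an array with even height and width."""
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    ll = (a + b + c + d) / 2.0  # low-frequency (coarse structure) band
    lh = (a - b + c - d) / 2.0  # detail band: differences across columns
    hl = (a + b - c - d) / 2.0  # detail band: differences across rows
    hh = (a - b - c + d) / 2.0  # detail band: diagonal differences
    return ll, (lh, hl, hh)


def haar_idwt2(ll, highs):
    """Inverse of haar_dwt2."""
    lh, hl, hh = highs
    h, w = ll.shape
    x = np.empty((2 * h, 2 * w), dtype=ll.dtype)
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    x[1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x


def mix_frequencies(base, sampled, high_gain=1.0):
    """Combine the low band of `base` with the detail bands of `sampled`.

    `high_gain` is a hypothetical stand-in for the paper's selective
    high-frequency guidance, whose exact form the abstract does not give.
    """
    ll_base, _ = haar_dwt2(base)
    _, highs = haar_dwt2(sampled)
    highs = tuple(high_gain * band for band in highs)
    return haar_idwt2(ll_base, highs)
```

For example, `mix_frequencies(base, sampled, high_gain=1.5)` returns an array whose coarse structure matches `base` while its detail coefficients are those of `sampled` scaled by 1.5. Because the Haar transform used here is orthonormal, the two sets of bands can be swapped without interfering with each other.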