One-Step Residual Shifting Diffusion for Image Super-Resolution via Distillation
March 17, 2025
作者: Daniil Selikhanovych, David Li, Aleksei Leonov, Nikita Gushchin, Sergei Kushneriuk, Alexander Filippov, Evgeny Burnaev, Iaroslav Koshelev, Alexander Korotin
cs.AI
Abstract
Diffusion models for super-resolution (SR) produce high-quality visual results but incur high computational costs. Despite the development of several methods to accelerate diffusion-based SR models, some (e.g., SinSR) fail to produce realistic perceptual details, while others (e.g., OSEDiff) may hallucinate non-existent structures. To overcome these issues, we present RSD, a new distillation method for ResShift, one of the top diffusion-based SR models. Our method trains the student network to produce images such that a new fake ResShift model trained on them coincides with the teacher model. RSD achieves single-step restoration and outperforms the teacher by a large margin. We show that our distillation method surpasses the other distillation-based method for ResShift, SinSR, making it on par with state-of-the-art diffusion-based SR distillation methods. Compared to SR methods based on pre-trained text-to-image models, RSD produces competitive perceptual quality, provides images better aligned with the degraded input images, and requires fewer parameters and less GPU memory. We provide experimental results on various real-world and synthetic datasets, including RealSR, RealSet65, DRealSR, ImageNet, and DIV2K.
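The distillation idea in the abstract (train a student so that a "fake" model fitted to the student's outputs coincides with the teacher) can be illustrated with a toy alternating-optimization loop. This is only a conceptual sketch, not the authors' actual RSD objective: real RSD distills a diffusion model over images, whereas here the teacher, student, and fake model are scalar linear maps, and all names and hyperparameters are illustrative.

```python
# Toy sketch of teacher/fake-model distillation: the fake model is fitted to
# the student's outputs, and the student is updated so that the fake model's
# predictions coincide with the teacher's. Scalar stand-in, not the RSD loss.
import random

random.seed(0)

def teacher(y):
    # Fixed "pretrained teacher" mapping (stand-in for the ResShift teacher).
    return 2.0 * y + 1.0

# Student and fake model are scalar linear maps w * y + b.
student = [1.0, 0.0]
fake = [1.0, 0.0]
lr = 0.05

data = [random.uniform(-1.0, 1.0) for _ in range(100)]

for _ in range(5000):
    y = random.choice(data)
    x_student = student[0] * y + student[1]

    # Step 1: fit the fake model to the current student outputs.
    err = (fake[0] * y + fake[1]) - x_student
    fake[0] -= lr * err * y
    fake[1] -= lr * err

    # Step 2: update the student so the fake model coincides with the teacher.
    # Since the fake model tracks the student, this pushes the student's
    # outputs toward the teacher's.
    gap = (fake[0] * y + fake[1]) - teacher(y)
    student[0] -= lr * gap * y
    student[1] -= lr * gap

# The student is expected to approach the teacher's map (w ~ 2.0, b ~ 1.0).
print(round(student[0], 2), round(student[1], 2))
```

In this toy setting the fake model converges to the student's input-output map, so the student update reduces the student-teacher gap; the one-step-generation aspect of RSD corresponds to the student being a single direct map rather than an iterative sampler.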