Steering One-Step Diffusion Model with Fidelity-Rich Decoder for Fast Image Compression
August 7, 2025
Authors: Zheng Chen, Mingde Zhou, Jinpei Guo, Jiale Yuan, Yifei Ji, Yulun Zhang
cs.AI
Abstract
Diffusion-based image compression has demonstrated impressive perceptual
performance. However, it suffers from two critical drawbacks: (1) excessive
decoding latency due to multi-step sampling, and (2) poor fidelity resulting
from over-reliance on generative priors. To address these issues, we propose
SODEC, a novel single-step diffusion image compression model. We argue that in
image compression, a sufficiently informative latent renders multi-step
refinement unnecessary. Based on this insight, we leverage a pre-trained
VAE-based model to produce latents with rich information, and replace the
iterative denoising process with a single-step decoding. Meanwhile, to improve
fidelity, we introduce the fidelity guidance module, encouraging output that is
faithful to the original image. Furthermore, we design the rate annealing
training strategy to enable effective training under extremely low bitrates.
Extensive experiments show that SODEC significantly outperforms existing
methods, achieving superior rate-distortion-perception performance. Moreover,
compared to previous diffusion-based compression models, SODEC improves
decoding speed by more than 20×. Code is released at:
https://github.com/zhengchen1999/SODEC.
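
The abstract does not say how the rate annealing training strategy is scheduled. As a rough illustration only, the sketch below assumes a generic rate-distortion objective of the form distortion + weight × rate and a linear ramp of the weight, so that training starts at a higher bitrate and moves toward a lower one. The function names, the linear schedule, and all numeric values are hypothetical and are not taken from the SODEC code.

```python
def annealed_rate_weight(step, total_steps, start_weight, end_weight):
    """Linearly interpolate a rate-penalty weight (hypothetical schedule)."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp progress to [0, 1] so the weight stays at end_weight after the ramp.
    progress = min(max(step / total_steps, 0.0), 1.0)
    return start_weight + (end_weight - start_weight) * progress


def rate_distortion_loss(distortion, rate_bpp, rate_weight):
    """Generic rate-distortion objective: distortion + weight * rate."""
    return distortion + rate_weight * rate_bpp


if __name__ == "__main__":
    # Assumed values: a small weight early (higher bitrate allowed),
    # a larger weight late (pushes the model toward lower bitrates).
    for step in (0, 5000, 10000):
        w = annealed_rate_weight(step, 10000, start_weight=0.1, end_weight=2.0)
        print(step, w, rate_distortion_loss(0.05, 0.03, w))
```

Under this assumption, a larger weight penalizes bits more heavily, so ramping the weight upward gradually tightens the bitrate budget instead of imposing the extremely low target from the first step.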