Steering One-Step Diffusion Model with Fidelity-Rich Decoder for Fast Image Compression

Abstract

Diffusion-based image compression has demonstrated impressive perceptual performance. However, it suffers from two critical drawbacks: (1) excessive decoding latency due to multi-step sampling, and (2) poor fidelity resulting from over-reliance on generative priors. To address these issues, we propose SODEC, a novel single-step diffusion image compression model. We argue that in image compression, a sufficiently informative latent renders multi-step refinement unnecessary. Based on this insight, we leverage a pre-trained VAE-based model to produce information-rich latents and replace the iterative denoising process with single-step decoding. To improve fidelity, we introduce a fidelity guidance module that encourages output faithful to the original image. Furthermore, we design a rate annealing training strategy to enable effective training at extremely low bitrates. Extensive experiments show that SODEC significantly outperforms existing methods, achieving superior rate-distortion-perception performance. Moreover, compared to previous diffusion-based compression models, SODEC improves decoding speed by more than 20×. Code is released at: https://github.com/zhengchen1999/SODEC.
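The abstract names a rate annealing training strategy but does not describe its schedule. As a rough illustration only, the sketch below assumes a loss of the form distortion + lam * rate and anneals the rate weight log-linearly from a weak penalty (higher bitrate) to a strong one (lower bitrate). The function names, the loss form, and the schedule shape are all hypothetical and are not taken from the SODEC code.

```python
import math


def annealed_rate_weight(step, anneal_steps, lam_start, lam_target):
    """Rate weight for a loss of the form: distortion + lam * rate.

    Hypothetical schedule (an assumption, not the paper's): interpolate in
    log-space from lam_start (weak rate penalty, higher bitrate) to
    lam_target (strong rate penalty, lower bitrate) over anneal_steps,
    then hold lam_target constant.
    """
    if anneal_steps <= 0 or step >= anneal_steps:
        return lam_target
    t = max(step, 0) / anneal_steps  # training progress in [0, 1)
    log_lam = (1.0 - t) * math.log(lam_start) + t * math.log(lam_target)
    return math.exp(log_lam)


def rate_distortion_loss(distortion, rate_bpp, lam):
    """Assumed training objective: distortion plus weighted bitrate."""
    return distortion + lam * rate_bpp


if __name__ == "__main__":
    # Print the rate weight at a few points of a 1000-step annealing phase.
    for step in (0, 250, 500, 750, 1000):
        lam = annealed_rate_weight(step, 1000, lam_start=0.01, lam_target=1.0)
        print(step, round(lam, 4))
```

Under this assumption, the model starts training at an easier, higher-bitrate operating point and is moved gradually toward the target low-bitrate regime, rather than being trained at the extreme bitrate from the first step.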

