High-Fidelity Image Compression with Score-based Generative Models

May 26, 2023
Authors: Emiel Hoogeboom, Eirikur Agustsson, Fabian Mentzer, Luca Versari, George Toderici, Lucas Theis
cs.AI

Abstract

Despite the tremendous success of diffusion generative models in text-to-image generation, replicating this success in the domain of image compression has proven difficult. In this paper, we demonstrate that diffusion can significantly improve perceptual quality at a given bit-rate, outperforming state-of-the-art approaches PO-ELIC and HiFiC as measured by FID score. This is achieved using a simple but theoretically motivated two-stage approach combining an autoencoder targeting MSE followed by a further score-based decoder. However, as we will show, implementation details matter and the optimal design decisions can differ greatly from typical text-to-image models.