PerCoV2: Improved Ultra-Low Bit-Rate Perceptual Image Compression with Implicit Hierarchical Masked Image Modeling
March 12, 2025
Authors: Nikolai Körber, Eduard Kromer, Andreas Siebert, Sascha Hauke, Daniel Mueller-Gritschneder, Björn Schuller
cs.AI
Abstract
We introduce PerCoV2, a novel and open ultra-low bit-rate perceptual image
compression system designed for bandwidth- and storage-constrained
applications. Building upon prior work by Careil et al., PerCoV2 extends the
original formulation to the Stable Diffusion 3 ecosystem and enhances entropy
coding efficiency by explicitly modeling the discrete hyper-latent image
distribution. To this end, we conduct a comprehensive comparison of recent
autoregressive methods (VAR and MaskGIT) for entropy modeling and evaluate our
approach on the large-scale MSCOCO-30k benchmark. Compared to previous work,
PerCoV2 (i) achieves higher image fidelity at even lower bit-rates while
maintaining competitive perceptual quality, (ii) features a hybrid generation
mode for further bit-rate savings, and (iii) is built solely on public
components. Code and trained models will be released at
https://github.com/Nikolai10/PerCoV2.
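
The abstract's claim that explicitly modeling the discrete hyper-latent distribution improves entropy coding efficiency rests on a general principle: an ideal entropy coder spends about -log2 p(s) bits per symbol, so a probability model that matches the data better yields a shorter code. The following is a minimal sketch of that principle only, not the PerCoV2 method. The symbol sequence, codebook size, and the function name `ideal_code_length_bits` are hypothetical, and the "fitted" model is simply the empirical frequency table of the toy sequence, whereas the paper uses learned VAR- and MaskGIT-style models.

```python
import math
from collections import Counter


def ideal_code_length_bits(symbols, probs):
    """Return the ideal (Shannon) code length: sum of -log2 p(s) over the sequence."""
    return sum(-math.log2(probs[s]) for s in symbols)


# Hypothetical toy sequence of discrete codebook indices (not data from the paper).
codebook_size = 4
symbols = [0, 0, 0, 0, 1, 1, 2, 3]

# Baseline: no explicit model, every index is assumed equally likely.
uniform = {k: 1.0 / codebook_size for k in range(codebook_size)}

# Stand-in for an explicit model: empirical frequencies of the same sequence.
# Assumption: the cost of transmitting or learning the model itself is ignored.
counts = Counter(symbols)
fitted = {k: counts[k] / len(symbols) for k in counts}

uniform_bits = ideal_code_length_bits(symbols, uniform)
fitted_bits = ideal_code_length_bits(symbols, fitted)

print(f"uniform model: {uniform_bits:.1f} bits")
print(f"fitted model:  {fitted_bits:.1f} bits")
```

On this toy sequence the uniform model needs 16 bits and the fitted model 14 bits. A practical codec would feed the model's probabilities to an arithmetic or range coder, which approaches these ideal lengths.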