Universal Image Restoration Pre-training via Masked Degradation Classification
October 15, 2025
Authors: JiaKui Hu, Zhengjian Yao, Lujia Jin, Yinghao Chen, Yanye Lu
cs.AI
Abstract
This study introduces a Masked Degradation Classification Pre-Training method
(MaskDCPT), designed to facilitate the classification of degradation types in
input images, leading to comprehensive image restoration pre-training. Unlike
conventional pre-training methods, MaskDCPT uses the degradation type of the
image as an extremely weak supervision signal, while simultaneously leveraging
image reconstruction to enhance performance and robustness. MaskDCPT consists of
an encoder and two decoders: the encoder extracts features from the masked
low-quality input image; the classification decoder uses these features to
identify the degradation type, whereas the reconstruction decoder aims to
reconstruct the corresponding high-quality image. This design allows the
pre-training to benefit from both masked image modeling and contrastive
learning, resulting in a generalized representation suited for restoration
tasks. Benefiting from the straightforward yet potent MaskDCPT, the pre-trained
encoder can be used to address universal image restoration and achieves
outstanding performance. Implementing MaskDCPT significantly improves
performance for both convolutional neural networks (CNNs) and Transformers, with
a minimum PSNR increase of 3.77 dB in the 5D all-in-one restoration task and
a 34.8% reduction in PIQE compared to the baseline in real-world degradation
scenarios. It also exhibits strong generalization to previously unseen
degradation types and levels. In addition, we curate and release the UIR-2.5M
dataset, which includes 2.5 million paired restoration samples across 19
degradation types and over 200 degradation levels, incorporating both synthetic
and real-world data. The dataset, source code, and models are available at
https://github.com/MILab-PKU/MaskDCPT.
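
To make the described pipeline concrete, below is a minimal PyTorch-style sketch of a MaskDCPT-like pre-training step: a masked low-quality image is encoded once, and the shared features feed both a degradation-classification head and a reconstruction head. The masking scheme, head designs, module names, and loss weighting here are illustrative assumptions rather than the paper's exact configuration, and the sketch assumes an encoder that preserves spatial resolution, as is common in restoration backbones.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def random_patch_mask(x, patch=16, ratio=0.75):
    # Zero out a random subset of patches of the low-quality input
    # (hypothetical masking scheme; the actual MaskDCPT strategy may differ).
    b, c, h, w = x.shape
    keep = (torch.rand(b, 1, h // patch, w // patch, device=x.device) > ratio).float()
    mask = F.interpolate(keep, size=(h, w), mode="nearest")
    return x * mask


class MaskDCPTSketch(nn.Module):
    # Encoder plus two decoders: one classifies the degradation type,
    # the other reconstructs the corresponding high-quality image.
    def __init__(self, encoder, feat_dim, num_degradations=19):
        super().__init__()
        self.encoder = encoder  # any CNN/Transformer backbone preserving H x W
        self.cls_decoder = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(feat_dim, num_degradations),
        )
        self.rec_decoder = nn.Sequential(
            nn.Conv2d(feat_dim, feat_dim, 3, padding=1), nn.GELU(),
            nn.Conv2d(feat_dim, 3, 3, padding=1),
        )

    def forward(self, lq):
        feats = self.encoder(random_patch_mask(lq))   # shared representation
        return self.cls_decoder(feats), self.rec_decoder(feats)


def pretrain_step(model, lq, hq, deg_label, lam=1.0):
    # Joint objective: degradation classification (weak supervision) plus
    # masked reconstruction toward the paired high-quality target.
    logits, recon = model(lq)
    return F.l1_loss(recon, hq) + lam * F.cross_entropy(logits, deg_label)
```

In this reading, a pre-training loop would sample paired low-/high-quality crops and their degradation labels (e.g. from UIR-2.5M) and minimize `pretrain_step`; afterwards only the encoder would be retained and fine-tuned for universal image restoration.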