Universal Image Restoration Pre-training via Masked Degradation Classification
October 15, 2025
Authors: JiaKui Hu, Zhengjian Yao, Lujia Jin, Yinghao Chen, Yanye Lu
cs.AI
Abstract
This study introduces a Masked Degradation Classification Pre-Training method
(MaskDCPT), designed to facilitate the classification of degradation types in
input images, leading to comprehensive image restoration pre-training. Unlike
conventional pre-training methods, MaskDCPT uses the degradation type of the
image as an extremely weak supervision signal, while simultaneously leveraging
image reconstruction to enhance performance and robustness. MaskDCPT comprises
an encoder and two decoders: the encoder extracts features from the masked
low-quality input image; the classification decoder uses these features to
identify the degradation type, whereas the reconstruction decoder aims to
reconstruct the corresponding high-quality image. This design allows the
pre-training to benefit from both masked image modeling and contrastive
learning, resulting in a generalized representation suited for restoration
tasks. Benefiting from the straightforward yet potent MaskDCPT, the pre-trained
encoder can be used to address universal image restoration and achieve
outstanding performance. Applying MaskDCPT significantly improves the
performance of both convolutional neural networks (CNNs) and Transformers,
yielding a PSNR increase of at least 3.77 dB in the 5D all-in-one restoration
task and a 34.8% reduction in PIQE relative to the baseline in real-world
degradation scenarios. It also exhibits strong generalization to previously unseen
degradation types and levels. In addition, we curate and release the UIR-2.5M
dataset, which includes 2.5 million paired restoration samples across 19
degradation types and over 200 degradation levels, incorporating both synthetic
and real-world data. The dataset, source code, and models are available at
https://github.com/MILab-PKU/MaskDCPT.
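
Below is a minimal PyTorch-style sketch of the pre-training setup described in the abstract: a shared encoder processes the masked low-quality input, a classification decoder predicts the degradation type, and a reconstruction decoder predicts the high-quality image, with the two losses trained jointly. All names here (`MaskDCPTSketch`, `pretrain_step`, `feat_dim`, the loss weight `alpha`) and the decoder shapes are illustrative assumptions, not the released implementation; see the repository above for the authors' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MaskDCPTSketch(nn.Module):
    """Illustrative MaskDCPT-style wrapper: one encoder, two decoders (names assumed)."""

    def __init__(self, encoder: nn.Module, feat_dim: int, num_degradation_types: int = 19):
        super().__init__()
        self.encoder = encoder  # shared backbone (CNN or Transformer) producing (B, C, H, W) features
        self.cls_decoder = nn.Linear(feat_dim, num_degradation_types)  # degradation-type head
        self.rec_decoder = nn.Sequential(  # lightweight reconstruction head; the real decoder may differ
            nn.Conv2d(feat_dim, feat_dim, kernel_size=3, padding=1),
            nn.GELU(),
            nn.Conv2d(feat_dim, 3, kernel_size=3, padding=1),
        )

    def forward(self, masked_lq: torch.Tensor):
        feats = self.encoder(masked_lq)                       # features of the masked low-quality input
        logits = self.cls_decoder(feats.mean(dim=(-2, -1)))   # global-pooled features -> degradation logits
        recon = self.rec_decoder(feats)                       # predicted high-quality image
        return logits, recon


def pretrain_step(model, masked_lq, hq, deg_label, alpha: float = 1.0):
    """One pre-training step: degradation classification + HQ reconstruction losses."""
    logits, recon = model(masked_lq)
    loss_cls = F.cross_entropy(logits, deg_label)  # weak supervision from the degradation type
    loss_rec = F.l1_loss(recon, hq)                # reconstruction toward the paired high-quality image
    # assumes features keep the spatial size of hq for simplicity; alpha is an assumed weighting
    return loss_cls + alpha * loss_rec
```

In this sketch the pre-trained `encoder` is what would later be reused for downstream universal restoration, while both decoders exist only to supply the pre-training signals.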