RestoreFormer++: Towards Real-World Blind Face Restoration from Undegraded Key-Value Pairs

Abstract

Blind face restoration aims to recover high-quality face images from inputs with unknown degradations. Current algorithms mainly introduce priors to supply high-quality details and have made impressive progress. However, most of them ignore the abundant contextual information in the face and its interplay with the priors, which leads to sub-optimal performance. They also pay little attention to the gap between synthetic and real-world scenarios, which limits their robustness and generalization in real-world applications. In this work, we propose RestoreFormer++, which on the one hand introduces fully-spatial attention mechanisms to model the contextual information and its interplay with the priors, and on the other hand explores an extending degrading model that helps generate more realistic degraded face images and so alleviates the synthetic-to-real-world gap. Compared with current algorithms, RestoreFormer++ has several crucial benefits. First, instead of the multi-head self-attention used in traditional visual transformers, we introduce multi-head cross-attention over multi-scale features to fully explore the spatial interactions between corrupted information and high-quality priors. This helps RestoreFormer++ restore face images with higher realness and fidelity. Second, in contrast to a recognition-oriented dictionary, we learn a reconstruction-oriented dictionary as priors, which contains more diverse high-quality facial details and better matches the restoration target. Third, we introduce an extending degrading model that covers more realistic degradation scenarios for training data synthesis, which helps improve the robustness and generalization of RestoreFormer++. Extensive experiments show that RestoreFormer++ outperforms state-of-the-art algorithms on both synthetic and real-world datasets.
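
To make the cross-attention idea concrete, the following is a minimal NumPy sketch at a single scale, not the authors' implementation. It assumes that queries come from the degraded features and that keys and values come from high-quality priors, as the title's "undegraded key-value pairs" suggests. The nearest-neighbour dictionary lookup, the random projection weights, and the names lookup_priors and multi_head_cross_attention are illustrative assumptions only.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def lookup_priors(features, dictionary):
    # ASSUMPTION: each degraded feature is matched to its nearest
    # dictionary entry (squared Euclidean distance).
    # features: (N, C), dictionary: (K, C) -> priors: (N, C)
    d2 = ((features[:, None, :] - dictionary[None, :, :]) ** 2).sum(axis=-1)
    return dictionary[d2.argmin(axis=1)]

def multi_head_cross_attention(degraded, priors, w_q, w_k, w_v, w_o, num_heads):
    # Queries are projected from the degraded features; keys and values
    # are projected from the priors. With self-attention all three would
    # come from the same input.
    c = degraded.shape[1]
    head_dim = c // num_heads

    def split(x):
        # (N, C) -> (heads, N, head_dim)
        return x.reshape(x.shape[0], num_heads, head_dim).transpose(1, 0, 2)

    q = split(degraded @ w_q)
    k = split(priors @ w_k)
    v = split(priors @ w_v)

    # Every spatial position attends to every prior position.
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(head_dim), axis=-1)
    fused = (attn @ v).transpose(1, 0, 2).reshape(-1, c)
    return fused @ w_o, attn

# Toy usage: a flattened 4x4 feature map with 8 channels and a
# hypothetical dictionary of 32 entries.
rng = np.random.default_rng(0)
degraded = rng.normal(size=(16, 8))
dictionary = rng.normal(size=(32, 8))
w_q, w_k, w_v, w_o = (rng.normal(size=(8, 8)) * 0.1 for _ in range(4))

priors = lookup_priors(degraded, dictionary)
restored, attn = multi_head_cross_attention(
    degraded, priors, w_q, w_k, w_v, w_o, num_heads=2
)
```

In this sketch the attention matrix has one row per degraded position and one column per prior position, so each restored feature is a weighted mix of prior-derived values. A multi-scale variant would repeat the same step on feature maps of several resolutions.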
