DiffIR2VR-Zero: Zero-Shot Video Restoration with Diffusion-based Image Restoration Models
July 1, 2024
Authors: Chang-Han Yeh, Chin-Yang Lin, Zhixiang Wang, Chi-Wei Hsiao, Ting-Hsuan Chen, Yu-Lun Liu
cs.AI
Abstract
This paper introduces a method for zero-shot video restoration using
pre-trained image restoration diffusion models. Traditional video restoration
methods often need retraining for different settings and struggle with limited
generalization across various degradation types and datasets. Our approach uses
a hierarchical token merging strategy for keyframes and local frames, combined
with a hybrid correspondence mechanism that blends optical flow and
feature-based nearest neighbor matching (latent merging). We show that our
method not only achieves top performance in zero-shot video restoration but
also significantly surpasses trained models in generalization across diverse
datasets and extreme degradations (8× super-resolution and high-standard-deviation
video denoising). We present evidence through quantitative metrics
and visual comparisons on various challenging datasets. Additionally, our
technique works with any 2D restoration diffusion model, offering a versatile
and powerful tool for video enhancement tasks without extensive retraining.
This research leads to more efficient and widely applicable video restoration
technologies, supporting advancements in fields that require high-quality video
output. See our project page for video results at
https://jimmycv07.github.io/DiffIR2VR_web/.
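The abstract's "latent merging" via feature-based nearest-neighbor matching can be illustrated with a minimal sketch: each local-frame latent token is matched to its most similar keyframe token by cosine similarity and blended toward it. This is an illustrative simplification, not the authors' implementation; the function name, the `ratio` parameter, and the flat `(N, C)` token layout are assumptions for the example.

```python
import numpy as np

def nearest_neighbor_merge(local_tokens, key_tokens, ratio=0.5):
    """Blend each local-frame token toward its nearest keyframe token.

    local_tokens: (N_local, C) array of latent tokens from the current frame.
    key_tokens:   (N_key, C) array of latent tokens from the keyframe.
    ratio: blend weight toward the matched keyframe token (assumed
           hyperparameter, not taken from the paper).
    """
    # L2-normalize so the dot product below is cosine similarity.
    ln = local_tokens / (np.linalg.norm(local_tokens, axis=1, keepdims=True) + 1e-8)
    kn = key_tokens / (np.linalg.norm(key_tokens, axis=1, keepdims=True) + 1e-8)
    sim = ln @ kn.T                       # (N_local, N_key) similarity matrix
    match = sim.argmax(axis=1)            # index of nearest keyframe token
    # Linear blend: ratio=0 keeps the local token, ratio=1 copies the match.
    return (1 - ratio) * local_tokens + ratio * key_tokens[match]
```

In the full method this latent merge is combined with optical-flow correspondence (the hybrid mechanism), so flow handles large coherent motion while the feature match covers regions where flow is unreliable.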