DiffIR2VR-Zero: Zero-Shot Video Restoration with Diffusion-based Image Restoration Models
July 1, 2024
Authors: Chang-Han Yeh, Chin-Yang Lin, Zhixiang Wang, Chi-Wei Hsiao, Ting-Hsuan Chen, Yu-Lun Liu
cs.AI
Abstract
This paper introduces a method for zero-shot video restoration using
pre-trained image restoration diffusion models. Traditional video restoration
methods often need retraining for different settings and struggle with limited
generalization across various degradation types and datasets. Our approach uses
a hierarchical token merging strategy for keyframes and local frames, combined
with a hybrid correspondence mechanism that blends optical flow and
feature-based nearest neighbor matching (latent merging). We show that our
method not only achieves top performance in zero-shot video restoration but
also significantly surpasses trained models in generalization across diverse
datasets and extreme degradations (8× super-resolution and high-standard-deviation
video denoising). We present evidence through quantitative metrics
and visual comparisons on various challenging datasets. Additionally, our
technique works with any 2D restoration diffusion model, offering a versatile
and powerful tool for video enhancement tasks without extensive retraining.
This research leads to more efficient and widely applicable video restoration
technologies, supporting advancements in fields that require high-quality video
output. See our project page for video results at
https://jimmycv07.github.io/DiffIR2VR_web/.
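The hybrid correspondence mechanism described above can be illustrated with a minimal sketch. It assumes token-level latents as [N, C] arrays, a keyframe latent already warped by optical flow, and a per-token flow-confidence score; all function names, shapes, and the confidence-based weighting here are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

def nn_match(cur_tokens, key_tokens):
    """Index of the most cosine-similar keyframe token for each current-frame token."""
    a = cur_tokens / np.linalg.norm(cur_tokens, axis=1, keepdims=True)
    b = key_tokens / np.linalg.norm(key_tokens, axis=1, keepdims=True)
    return (a @ b.T).argmax(axis=1)

def hybrid_merge(cur, key, warped_key, flow_conf, alpha=0.5):
    """Blend current-frame latents with keyframe latents (hypothetical sketch).

    cur, key, warped_key: [N, C] token latents; warped_key is the keyframe
    latent already warped by optical flow. flow_conf: [N] values in [0, 1]
    giving per-token flow reliability. Where flow is trusted, the warped
    latent is used; elsewhere we fall back to the nearest-neighbor match.
    """
    nn_latent = key[nn_match(cur, key)]          # feature-based correspondence
    w = flow_conf[:, None]
    ref = w * warped_key + (1.0 - w) * nn_latent  # hybrid reference latent
    return alpha * cur + (1.0 - alpha) * ref      # merge into current frame
```

In this toy form, fully confident flow (`flow_conf == 1`) reduces the merge to pure flow-based warping, while zero confidence falls back entirely to nearest-neighbor latent matching, mirroring the abstract's idea of blending the two correspondence sources.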