Comprehensive Relighting: Generalizable and Consistent Monocular Human Relighting and Harmonization
April 3, 2025
Authors: Junying Wang, Jingyuan Liu, Xin Sun, Krishna Kumar Singh, Zhixin Shu, He Zhang, Jimei Yang, Nanxuan Zhao, Tuanfeng Y. Wang, Simon S. Chen, Ulrich Neumann, Jae Shin Yoon
cs.AI
Abstract
This paper introduces Comprehensive Relighting, the first all-in-one approach
that can both control and harmonize the lighting of an image or video of
humans with arbitrary body parts from any scene. Building such a generalizable
model is extremely challenging due to the lack of datasets, which restricts
existing image-based relighting models to specific scenarios (e.g., faces or
static humans). To address this challenge, we repurpose a pre-trained diffusion
model as a general image prior and jointly model human relighting and
background harmonization in a coarse-to-fine framework. To further enhance the
temporal coherence of the relighting, we introduce an unsupervised temporal
lighting model that learns lighting cycle consistency from many real-world
videos without any ground truth. At inference time, our temporal lighting
module is combined with the diffusion model through spatio-temporal feature
blending algorithms without extra training, and we apply a new guided
refinement as a post-processing step to preserve the high-frequency details of
the input image. In the experiments, Comprehensive Relighting shows strong
generalizability and temporal coherence of lighting, outperforming existing
image-based human relighting and harmonization methods.
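
The abstract does not say how the guided refinement is implemented. As a purely illustrative sketch of the general idea of carrying high-frequency detail from an input image over to a relit result, and not the paper's algorithm, the example below splits images into a blurred base and a detail residual using a simple box blur. The function names `box_blur` and `transfer_high_frequency`, the `radius` parameter, and the additive recombination are all assumptions made for this illustration.

```python
import numpy as np

def box_blur(img, radius):
    """Separable box blur with edge padding; img is an HxW or HxWxC float array."""
    k = 2 * radius + 1
    kernel = np.ones(k) / k
    out = np.asarray(img, dtype=np.float64)
    for axis in (0, 1):  # blur rows, then columns
        pad = [(0, 0)] * out.ndim
        pad[axis] = (radius, radius)
        padded = np.pad(out, pad, mode="edge")
        out = np.apply_along_axis(
            lambda v: np.convolve(v, kernel, mode="valid"), axis, padded
        )
    return out

def transfer_high_frequency(input_img, relit_img, radius=2):
    """Hypothetical detail transfer: low frequencies from the relit image,
    high-frequency residual from the input image. Values assumed in [0, 1]."""
    detail = input_img - box_blur(input_img, radius)  # fine texture of the input
    base = box_blur(relit_img, radius)                # smooth shading of the relit result
    return np.clip(base + detail, 0.0, 1.0)
```

In this sketch, a larger `radius` takes more of the image content from the input and less from the relit result, so it trades lighting fidelity against detail preservation.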