Comprehensive Relighting: Generalizable and Consistent Monocular Human Relighting and Harmonization
April 3, 2025
Authors: Junying Wang, Jingyuan Liu, Xin Sun, Krishna Kumar Singh, Zhixin Shu, He Zhang, Jimei Yang, Nanxuan Zhao, Tuanfeng Y. Wang, Simon S. Chen, Ulrich Neumann, Jae Shin Yoon
cs.AI
Abstract
This paper introduces Comprehensive Relighting, the first all-in-one approach
that can both control and harmonize the lighting of an image or video of
humans, with arbitrary body parts visible, in any scene. Building such a
generalizable model is extremely challenging because suitable datasets are
lacking, which restricts existing image-based relighting models to a specific
scenario (e.g., a face or a static human). To address this challenge, we
repurpose a pre-trained diffusion model as a general image prior and jointly
model human relighting and background harmonization in a coarse-to-fine
framework. To further enhance the temporal coherence of the relighting, we
introduce an unsupervised temporal lighting model that learns lighting cycle
consistency from many real-world videos without any ground truth. At inference
time, our temporal lighting module is combined with the diffusion model through
spatio-temporal feature blending algorithms, without extra training, and we
apply a new guided refinement as a post-processing step to preserve the
high-frequency details of the input image. In our experiments, Comprehensive
Relighting shows strong generalizability and temporally coherent lighting,
outperforming existing image-based human relighting and harmonization methods.
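
The abstract does not say how the guided refinement works, so the following is only an illustrative sketch of the general idea of preserving high-frequency input detail after relighting. It assumes a simple frequency split (a box blur as the low-pass filter), which is a stand-in technique and not the paper's method. The names `box_blur` and `transfer_detail` and the `radius` parameter are hypothetical.

```python
from typing import List

# Grayscale image as rows of floats in [0, 1] (an assumption for this sketch).
Image = List[List[float]]


def box_blur(img: Image, radius: int = 1) -> Image:
    """Mean filter with edge clamping; used here as a crude low-pass filter."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy = min(max(y + dy, 0), h - 1)  # clamp row index
                    xx = min(max(x + dx, 0), w - 1)  # clamp column index
                    total += img[yy][xx]
                    count += 1
            out[y][x] = total / count
    return out


def transfer_detail(source: Image, relit: Image, radius: int = 1) -> Image:
    """Keep the low frequencies of `relit` and add the high frequencies of `source`.

    This is a plain frequency-split stand-in, NOT the guided refinement
    described in the paper.
    """
    src_low = box_blur(source, radius)
    relit_low = box_blur(relit, radius)
    h, w = len(source), len(source[0])
    refined = []
    for y in range(h):
        row = []
        for x in range(w):
            detail = source[y][x] - src_low[y][x]  # high-frequency residual
            value = relit_low[y][x] + detail
            row.append(min(1.0, max(0.0, value)))  # clamp to [0, 1]
        refined.append(row)
    return refined
```

Under these assumptions, a flat source image contributes no detail, so the output is just the blurred relit image, and passing the same image as both `source` and `relit` returns that image unchanged up to floating-point error.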