Comprehensive Relighting: Generalizable and Consistent Monocular Human Relighting and Harmonization

Abstract

This paper introduces Comprehensive Relighting, the first all-in-one approach that can both control and harmonize the lighting of an image or video of humans, with arbitrary body parts visible, in any scene. Building such a generalizable model is extremely challenging because of the lack of datasets, which restricts existing image-based relighting models to a specific scenario (e.g., faces or static humans). To address this challenge, we repurpose a pre-trained diffusion model as a general image prior and jointly model human relighting and background harmonization in a coarse-to-fine framework. To further enhance the temporal coherence of the relighting, we introduce an unsupervised temporal lighting model that learns lighting cycle consistency from many real-world videos without any ground truth. At inference time, our temporal lighting module is combined with the diffusion model through spatio-temporal feature blending algorithms, without extra training, and we apply a new guided refinement as a post-processing step to preserve the high-frequency details of the input image. In our experiments, Comprehensive Relighting shows strong generalizability and temporal coherence of lighting, outperforming existing image-based human relighting and harmonization methods.

