Photorealistic Object Insertion with Diffusion-Guided Inverse Rendering
August 19, 2024
Authors: Ruofan Liang, Zan Gojcic, Merlin Nimier-David, David Acuna, Nandita Vijaykumar, Sanja Fidler, Zian Wang
cs.AI
Abstract
The correct insertion of virtual objects in images of real-world scenes
requires a deep understanding of the scene's lighting, geometry and materials,
as well as the image formation process. While recent large-scale diffusion
models have shown strong generative and inpainting capabilities, we find that
current models do not sufficiently "understand" the scene shown in a single
picture to generate consistent lighting effects (shadows, bright reflections,
etc.) while preserving the identity and details of the composited object. We
propose using a personalized large diffusion model as guidance to a physically
based inverse rendering process. Our method recovers scene lighting and
tone-mapping parameters, allowing the photorealistic composition of arbitrary
virtual objects in single frames or videos of indoor or outdoor scenes. Our
physically based pipeline further enables automatic materials and tone-mapping
refinement.
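The core idea of recovering scene parameters (such as tone mapping) by descending a guidance loss can be sketched with a toy example. The real method uses a personalized diffusion model's score as guidance inside a physically based inverse renderer; in this sketch (all names and the loss are hypothetical stand-ins), a simple pixel loss against a reference image plays the guidance role, and two tone-mapping parameters are recovered by finite-difference gradient descent.

```python
import numpy as np

def tone_map(hdr, exposure, gamma):
    """Toy tone-mapping operator: exposure scaling, clipping, then gamma."""
    return np.clip(hdr * exposure, 0.0, 1.0) ** gamma

def guidance_loss(rendered, reference):
    # Stand-in for diffusion guidance: mean squared error to a reference.
    # The paper's guidance signal comes from a personalized diffusion model.
    return np.mean((rendered - reference) ** 2)

def recover_tone_mapping(hdr, reference, steps=2000, lr=0.2, eps=1e-4):
    """Recover (exposure, gamma) by finite-difference gradient descent."""
    params = np.array([1.0, 1.0])  # initial guess for exposure, gamma
    for _ in range(steps):
        base = guidance_loss(tone_map(hdr, *params), reference)
        grad = np.zeros(2)
        for i in range(2):
            p = params.copy()
            p[i] += eps
            grad[i] = (guidance_loss(tone_map(hdr, *p), reference) - base) / eps
        params -= lr * grad
    return params

# Synthetic example: render a reference with known parameters, then recover them.
rng = np.random.default_rng(0)
hdr = rng.uniform(0.0, 2.0, size=(32, 32))       # fake HDR radiance values
reference = tone_map(hdr, exposure=0.6, gamma=0.8)
exposure, gamma = recover_tone_mapping(hdr, reference)
```

The same optimization structure carries over when the pixel loss is replaced by a diffusion-guidance term and the parameter vector grows to include environment lighting, with automatic differentiation replacing the finite differences used here for brevity.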