Photorealistic Object Insertion with Diffusion-Guided Inverse Rendering
August 19, 2024
Authors: Ruofan Liang, Zan Gojcic, Merlin Nimier-David, David Acuna, Nandita Vijaykumar, Sanja Fidler, Zian Wang
cs.AI
Abstract
The correct insertion of virtual objects in images of real-world scenes
requires a deep understanding of the scene's lighting, geometry and materials,
as well as the image formation process. While recent large-scale diffusion
models have shown strong generative and inpainting capabilities, we find that
current models do not sufficiently "understand" the scene shown in a single
picture to generate consistent lighting effects (shadows, bright reflections,
etc.) while preserving the identity and details of the composited object. We
propose using a personalized large diffusion model as guidance to a physically
based inverse rendering process. Our method recovers scene lighting and
tone-mapping parameters, allowing the photorealistic composition of arbitrary
virtual objects in single frames or videos of indoor or outdoor scenes. Our
physically based pipeline further enables automatic materials and tone-mapping
refinement.
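As a rough illustration of the idea described above, the sketch below shows a score-distillation-style loop in which a diffusion model's denoising residual provides image-space gradients that are back-propagated into learnable scene lighting and tone-mapping parameters through a differentiable renderer. This is only a minimal sketch of that general technique, not the authors' pipeline; `render_composite` (a differentiable, physically based renderer for the scene with the inserted object) and `diffusion_eps` (the personalized diffusion model's noise predictor) are hypothetical placeholders, and the environment-map/exposure parameterization is an assumption.

```python
import torch

def optimize_scene_params(render_composite, diffusion_eps, num_steps=500,
                          lr=1e-2, device="cuda"):
    """Optimize lighting and tone-mapping parameters under diffusion guidance."""
    # Learnable scene parameters: a low-resolution HDR environment map and a
    # scalar log-exposure for tone mapping (illustrative parameterization).
    env_map = torch.randn(16, 32, 3, device=device, requires_grad=True)
    log_exposure = torch.zeros(1, device=device, requires_grad=True)
    opt = torch.optim.Adam([env_map, log_exposure], lr=lr)

    # Standard DDPM-style noise schedule (linear betas).
    betas = torch.linspace(1e-4, 2e-2, 1000, device=device)
    alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)

    for _ in range(num_steps):
        opt.zero_grad()
        # Differentiable, physically based render of the scene with the
        # inserted object, tone-mapped to LDR; shape (1, 3, H, W) in [0, 1].
        img = render_composite(env_map, log_exposure)

        # Forward-diffuse the rendering to a random timestep.
        t = torch.randint(50, 950, (1,), device=device)
        noise = torch.randn_like(img)
        a = alphas_cumprod[t].view(-1, 1, 1, 1)
        noisy = a.sqrt() * img + (1.0 - a).sqrt() * noise

        # The (personalized) diffusion model predicts the noise; the residual
        # acts as a score-distillation gradient on the rendered image.
        eps_pred = diffusion_eps(noisy, t)
        grad = (eps_pred - noise).detach()

        # Back-propagate the image-space gradient into the lighting and
        # tone-mapping parameters only; the inserted object's geometry and
        # identity stay fixed.
        img.backward(gradient=grad)
        opt.step()

    return env_map.detach(), log_exposure.detach()
```

The recovered lighting and tone-mapping parameters would then feed the physically based compositing step described in the abstract; the sketch omits the materials refinement and any video-specific handling.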