BRDFusion: Inverse Rendering of Urban Scenes by Fusing Physics and Generation

Abstract

Inverse rendering of urban scenes from captured videos enables numerous applications, including content creation and autonomous driving simulation. Physically based rendering methods obey the physics of lighting and are controllable, but they suffer from reconstruction and rendering artifacts. Generative models, by contrast, produce realistic videos but offer limited consistency and controllability. We present BRDFusion, a unified framework that combines these two complementary models for inverse and forward rendering. Specifically, BRDFusion recovers explicit, consistent scene properties through physical modeling and uses generative priors to alleviate optimization ambiguity. During forward rendering, the physical model provides controllable rendering from the scene configuration, and the generative model denoises the result and fixes artifacts. As a result, our method produces high-quality videos while allowing precise control, outperforming baselines on both real and synthetic scenes. Moreover, BRDFusion supports novel-view relighting, night simulation, and dynamic object insertion and editing. Project page: https://shigon255.github.io/brdfusion-page/