BRDFusion: Physically Based Generation Meets Inverse Rendering of Urban Scenes

Abstract

Inverse rendering of urban scenes from captured videos enables numerous applications, including content creation and autonomous driving simulation. Physically based rendering methods obey the physics of lighting and allow it to be controlled, but they suffer from reconstruction and rendering artifacts. Generative models produce realistic videos, but they offer limited consistency and controllability. We present BRDFusion, a unified framework that combines these two complementary models for both inverse and forward rendering. Specifically, during inverse rendering, BRDFusion recovers explicit, consistent scene properties through physical modeling and uses generative priors to alleviate ambiguity in the optimization. During forward rendering, the physical model provides controllable rendering from the scene configuration, and the generative model denoises the result and fixes artifacts. As a result, our method produces high-quality videos while allowing precise control, and it outperforms baselines on both real and synthetic scenes. Moreover, BRDFusion supports novel-view relighting, night simulation, and dynamic object insertion and editing. Project page: https://shigon255.github.io/brdfusion-page/