RenderFormer: Transformer-based Neural Rendering of Triangle Meshes with Global Illumination

Abstract

We present RenderFormer, a neural rendering pipeline that directly renders an image from a triangle-based representation of a scene with full global illumination effects and that does not require per-scene training or fine-tuning. Instead of taking a physics-centric approach to rendering, we formulate rendering as a sequence-to-sequence transformation in which a sequence of tokens representing triangles with reflectance properties is converted to a sequence of output tokens representing small patches of pixels. RenderFormer follows a two-stage pipeline: a view-independent stage that models triangle-to-triangle light transport, and a view-dependent stage that transforms a token representing a bundle of rays to the corresponding pixel values, guided by the triangle sequence from the view-independent stage. Both stages are based on the transformer architecture and are learned with minimal prior constraints. We demonstrate and evaluate RenderFormer on scenes with varying complexity in shape and light transport.
