MoCam: Unified Novel View Synthesis via Structured Denoising Dynamics

Abstract

Generative novel view synthesis faces a fundamental dilemma: geometric priors provide spatial alignment but become sparse and inaccurate under view changes, while appearance priors offer visual fidelity but lack geometric correspondence. Existing methods either propagate geometric errors throughout generation or suffer from signal conflicts when fusing both statically. We introduce MoCam, which employs structured denoising dynamics to orchestrate a coordinated progression from geometry to appearance within the diffusion process. MoCam first leverages geometric priors in early stages to anchor coarse structures and tolerate their incompleteness, then switches to appearance priors in later stages to actively correct geometric errors and refine details. This design naturally unifies static and dynamic view synthesis by temporally decoupling geometric alignment and appearance refinement within the diffusion process. Experiments demonstrate that MoCam significantly outperforms prior methods, particularly when point clouds contain severe holes or distortions, achieving robust geometry-appearance disentanglement.
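
The temporal decoupling described above can be illustrated with a minimal sketch of a step-gated conditioning schedule. This is not the authors' implementation: the function names, the hard switch, and the `switch_fraction` parameter are hypothetical, and the abstract does not state where or how the transition between priors occurs.

```python
from typing import List


def select_prior(step: int, num_steps: int, switch_fraction: float = 0.5) -> str:
    """Return which prior conditions the denoiser at a given step.

    `step` counts denoising iterations from 0 (noisiest) to num_steps - 1.
    `switch_fraction` is a hypothetical hyperparameter, not taken from the paper.
    """
    if not 0 <= step < num_steps:
        raise ValueError("step must lie in [0, num_steps)")
    # Early (high-noise) steps anchor coarse structure with the geometric prior;
    # later steps hand over to the appearance prior for correction and detail.
    return "geometry" if step < switch_fraction * num_steps else "appearance"


def conditioning_schedule(num_steps: int, switch_fraction: float = 0.5) -> List[str]:
    """List the active prior for every denoising step, in sampling order."""
    return [select_prior(s, num_steps, switch_fraction) for s in range(num_steps)]


if __name__ == "__main__":
    print(conditioning_schedule(4))
```

A hard switch is the simplest reading of "first ... then switches"; a smooth blend between the two priors would be an equally plausible assumption.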
