ViewFusion: Towards Multi-View Consistency via Interpolated Denoising

Abstract

Novel-view synthesis with diffusion models has shown remarkable potential for generating diverse, high-quality images. However, the prevailing methods generate each image independently, which makes it difficult to maintain consistency across multiple views. To address this, we introduce ViewFusion, a novel, training-free algorithm that can be seamlessly integrated into existing pre-trained diffusion models. Our approach is auto-regressive: it implicitly uses previously generated views as context when generating the next view, ensuring robust multi-view consistency throughout the novel-view generation process. Through a diffusion process that fuses known-view information via interpolated denoising, our framework extends single-view conditioned models to multi-view conditional settings without any additional fine-tuning. Extensive experimental results demonstrate the effectiveness of ViewFusion in generating consistent and detailed novel views.
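The abstract does not spell out the algorithm, so the following is only a toy sketch of how auto-regressive generation with interpolated denoising might look. Everything in it is an assumption made for illustration: the names `interpolation_weights`, `interpolated_denoise_step`, `generate_views`, and `toy_denoiser`, the softmax-over-pose-distance weighting, the scalar "pose", and the list-of-floats "image". It is not the authors' implementation and does not call a real diffusion model.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

Image = List[float]  # toy stand-in for an image or latent
# Hypothetical single-view conditioned denoiser:
# (noisy sample, conditioning view, relative pose, step) -> updated sample
Denoiser = Callable[[Image, Image, float, int], Image]


def interpolation_weights(distances: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Softmax over negative pose distances, so closer known views weigh more.

    This weighting scheme is an assumption, not taken from the source.
    """
    logits = [-d / temperature for d in distances]
    peak = max(logits)
    exps = [math.exp(l - peak) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


def interpolated_denoise_step(x: Image, step: int, target_pose: float,
                              known: Sequence[Tuple[float, Image]],
                              weights: Sequence[float], denoiser: Denoiser) -> Image:
    """Run the single-view denoiser once per known view and blend the results."""
    preds = [denoiser(x, img, target_pose - pose, step) for pose, img in known]
    return [sum(w * p[i] for w, p in zip(weights, preds)) for i in range(len(x))]


def generate_views(first_view: Image, target_poses: Sequence[float],
                   denoiser: Denoiser, num_steps: int = 10, seed: int = 0) -> List[Image]:
    """Generate views one at a time; each finished view joins the context."""
    rng = random.Random(seed)
    known: List[Tuple[float, Image]] = [(0.0, first_view)]
    for pose in target_poses:
        x = [rng.gauss(0.0, 1.0) for _ in first_view]  # start from noise
        weights = interpolation_weights([abs(pose - p) for p, _ in known])
        for step in range(num_steps, 0, -1):
            x = interpolated_denoise_step(x, step, pose, known, weights, denoiser)
        known.append((pose, x))  # auto-regressive: reuse as context
    return [img for _, img in known[1:]]


def toy_denoiser(x: Image, cond: Image, rel_pose: float, step: int) -> Image:
    """Placeholder model: pulls the sample toward the conditioning view.

    It ignores rel_pose; a real pose-conditioned model would not.
    """
    return [xi + (ci - xi) / step for xi, ci in zip(x, cond)]


if __name__ == "__main__":
    views = generate_views([0.2, 0.5, 0.8], [30.0, 60.0], toy_denoiser)
    print(len(views))
```

The point of the sketch is the control flow: the pre-trained model is only ever called with a single conditioning view, and the multi-view behaviour comes from blending those per-view predictions at every denoising step, which is why no fine-tuning would be needed under this reading.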

