HybridNeRF: Efficient Neural Rendering via Adaptive Volumetric Surfaces

Abstract

Neural radiance fields (NeRFs) provide state-of-the-art view synthesis quality but tend to be slow to render. One reason is that they rely on volume rendering, which requires many samples (and model queries) per ray at render time. Although this representation is flexible and easy to optimize, most real-world objects can be modeled more efficiently with surfaces instead of volumes, requiring far fewer samples per ray. This observation has spurred considerable progress in surface representations such as signed distance functions (SDFs), but these may struggle to model semi-opaque and thin structures. We propose a method, HybridNeRF, that leverages the strengths of both representations by rendering most objects as surfaces while modeling the (typically) small fraction of challenging regions volumetrically. We evaluate HybridNeRF on the challenging Eyeful Tower dataset along with other commonly used view synthesis datasets. Compared to state-of-the-art baselines, including recent rasterization-based approaches, we improve error rates by 15-30% while achieving real-time framerates (at least 36 FPS) at virtual-reality resolutions (2Kx2K).
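The sampling argument in the abstract can be illustrated with a small numerical sketch. The code below is not the HybridNeRF implementation. It applies the standard volume rendering quadrature to a single ray hitting a planar surface, and it assumes a VolSDF-style mapping from signed distance to density (a Laplace CDF with scale `beta`), which the abstract does not specify. The function names, the parameter `beta`, and the threshold `eps` are all hypothetical choices made for illustration.

```python
import math


def laplace_cdf(s, beta):
    """CDF of a zero-mean Laplace distribution with scale beta."""
    if s <= 0:
        return 0.5 * math.exp(s / beta)
    return 1.0 - 0.5 * math.exp(-s / beta)


def sdf_to_density(d, beta):
    """Assumed VolSDF-style mapping (not taken from the abstract):
    density = (1 / beta) * laplace_cdf(-d, beta), where d is the signed distance."""
    return laplace_cdf(-d, beta) / beta


def render_weights(densities, delta):
    """Standard volume rendering quadrature:
    w_i = T_i * (1 - exp(-sigma_i * delta)), with T_i the accumulated transmittance."""
    weights = []
    transmittance = 1.0
    for sigma in densities:
        alpha = 1.0 - math.exp(-sigma * delta)
        weights.append(transmittance * alpha)
        transmittance *= 1.0 - alpha
    return weights


def significant_samples(beta, n=256, near=0.0, far=2.0, surface_t=1.0, eps=1e-3):
    """Count samples whose rendering weight exceeds eps for a planar
    surface at depth surface_t, and return the total accumulated weight."""
    delta = (far - near) / n
    ts = [near + (i + 0.5) * delta for i in range(n)]
    # Signed distance along the ray: positive in front of the surface, negative behind it.
    densities = [sdf_to_density(surface_t - t, beta) for t in ts]
    weights = render_weights(densities, delta)
    return sum(w > eps for w in weights), sum(weights)


if __name__ == "__main__":
    for beta in (0.2, 0.05, 0.005):
        count, total = significant_samples(beta)
        print(f"beta={beta}: {count} samples carry weight, total weight {total:.4f}")
```

Under these assumptions, a small `beta` (surface-like density) concentrates nearly all of the rendering weight in a handful of samples around the surface, while a large `beta` (volume-like density) spreads it over much of the ray. This is the intuition behind needing far fewer samples per ray for surfaces.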
