NPGA: Neural Parametric Gaussian Avatars

May 29, 2024
Authors: Simon Giebenhain, Tobias Kirschstein, Martin Rünz, Lourdes Agapito, Matthias Nießner
cs.AI

Abstract

The creation of high-fidelity, digital versions of human heads is an important stepping stone in the process of further integrating virtual components into our everyday lives. Constructing such avatars is a challenging research problem, due to a high demand for photo-realism and real-time rendering performance. In this work, we propose Neural Parametric Gaussian Avatars (NPGA), a data-driven approach to create high-fidelity, controllable avatars from multi-view video recordings. We build our method around 3D Gaussian Splatting for its highly efficient rendering and to inherit the topological flexibility of point clouds. In contrast to previous work, we condition our avatars' dynamics on the rich expression space of neural parametric head models (NPHM), instead of mesh-based 3DMMs. To this end, we distill the backward deformation field of our underlying NPHM into forward deformations which are compatible with rasterization-based rendering. All remaining fine-scale, expression-dependent details are learned from the multi-view videos. To increase the representational capacity of our avatars, we augment the canonical Gaussian point cloud using per-primitive latent features which govern its dynamic behavior. To regularize this increased dynamic expressivity, we propose Laplacian terms on the latent features and predicted dynamics. We evaluate our method on the public NeRSemble dataset, demonstrating that NPGA significantly outperforms the previous state-of-the-art avatars on the self-reenactment task by 2.6 PSNR. Furthermore, we demonstrate accurate animation capabilities from real-world monocular videos.
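The abstract describes two mechanisms: per-primitive latent features that, together with an expression code, condition a forward deformation of the canonical Gaussian point cloud, and Laplacian smoothness terms on those features and on the predicted dynamics. The sketch below illustrates both ideas in minimal NumPy; all names, dimensions, the tiny MLP, and the k-NN graph construction are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the two mechanisms named in the abstract; every
# dimension and function here is an assumption for illustration only.
import numpy as np

rng = np.random.default_rng(0)

N, F, E = 1024, 8, 32                     # Gaussians, latent dim, expression dim
xyz = rng.normal(size=(N, 3))             # canonical Gaussian centers
feat = rng.normal(size=(N, F))            # per-primitive latent features
expr = rng.normal(size=(E,))              # NPHM-style expression code

def forward_deform(xyz, feat, expr, W1, b1, W2, b2):
    """Tiny MLP mapping (position, latent feature, expression) -> offset,
    applied per primitive so rendering stays rasterization-compatible."""
    expr_tiled = np.broadcast_to(expr, (xyz.shape[0], expr.shape[0]))
    inp = np.concatenate([xyz, feat, expr_tiled], axis=1)
    h = np.maximum(inp @ W1 + b1, 0.0)    # ReLU hidden layer
    return xyz + h @ W2 + b2              # deformed Gaussian centers

D_in, H = 3 + F + E, 64
W1 = rng.normal(scale=0.1, size=(D_in, H)); b1 = np.zeros(H)
W2 = rng.normal(scale=0.1, size=(H, 3));    b2 = np.zeros(3)
deformed = forward_deform(xyz, feat, expr, W1, b1, W2, b2)

def knn_laplacian_loss(values, xyz, k=6):
    """Graph-Laplacian smoothness on a point cloud: penalize each point's
    value for deviating from the mean over its k nearest neighbors."""
    d2 = ((xyz[:, None, :] - xyz[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)          # exclude self from neighbors
    nbrs = np.argsort(d2, axis=1)[:, :k]  # (N, k) neighbor indices
    diff = values - values[nbrs].mean(axis=1)
    return float((diff ** 2).mean())

loss_feat = knn_laplacian_loss(feat, xyz)            # smooth latent features
loss_dyn = knn_laplacian_loss(deformed - xyz, xyz)   # smooth predicted offsets
```

In a real training loop such losses would be added (with weights) to the photometric rendering loss, so that neighboring Gaussians carry similar latent features and deform coherently rather than independently.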

