Drivable 3D Gaussian Avatars
November 14, 2023
Authors: Wojciech Zielonka, Timur Bagautdinov, Shunsuke Saito, Michael Zollhöfer, Justus Thies, Javier Romero
cs.AI
Abstract
We present Drivable 3D Gaussian Avatars (D3GA), the first 3D controllable
model for human bodies rendered with Gaussian splats. Current photorealistic
drivable avatars require either accurate 3D registrations during training,
dense input images during testing, or both. The ones based on neural radiance
fields also tend to be prohibitively slow for telepresence applications. This
work uses the recently presented 3D Gaussian Splatting (3DGS) technique to
render realistic humans at real-time framerates, using dense calibrated
multi-view videos as input. To deform those primitives, we depart from the
commonly used point deformation method of linear blend skinning (LBS) and use a
classic volumetric deformation method: cage deformations. Given their smaller
size, we drive these deformations with joint angles and keypoints, which are
more suitable for communication applications. Our experiments on nine subjects
with varied body shapes, clothes, and motions obtain higher-quality results
than state-of-the-art methods when using the same training and test data.
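To make the cage-deformation idea concrete, below is a minimal sketch of how a tetrahedral cage can carry embedded points, such as Gaussian primitive centers, from a rest pose to a deformed pose. This is not the authors' implementation; the function names and the per-tetrahedron barycentric formulation are illustrative assumptions about one standard way cage deformations are realized.

```python
import numpy as np

def tet_barycentric(p, tet):
    """Barycentric coordinates of point p w.r.t. a tetrahedron.

    tet: (4, 3) array of cage-cell vertex positions in the rest pose.
    Returns weights w with w.sum() == 1 and p == w @ tet.
    """
    # Express p - v0 in the basis of the three edges leaving vertex 0.
    T = np.column_stack([tet[1] - tet[0], tet[2] - tet[0], tet[3] - tet[0]])
    b = np.linalg.solve(T, p - tet[0])
    # Homogeneous barycentric weights: first weight completes the sum to 1.
    return np.concatenate([[1.0 - b.sum()], b])

def cage_deform(p, rest_tet, posed_tet):
    """Carry p into the posed cage cell by reusing its rest-pose weights."""
    w = tet_barycentric(p, rest_tet)
    return w @ posed_tet

# Toy example: a unit tetrahedron stretched along z carries an interior
# point (standing in for a Gaussian center) along with it.
rest = np.array([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.], [0., 0., 1.]])
posed = rest.copy()
posed[:, 2] *= 2.0                        # stretch the cage cell vertically
center = np.array([0.2, 0.2, 0.25])       # point embedded in the rest cell
print(cage_deform(center, rest, posed))   # -> [0.2 0.2 0.5]
```

In a full pipeline one would assign each Gaussian to a cage cell once, then per frame only re-evaluate the weighted sum against the posed cage vertices, which is part of what makes cage-based driving cheap; a complete system would also transform each Gaussian's covariance (e.g., by the cell's deformation Jacobian), which this sketch omits.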