Expressive Whole-Body 3D Gaussian Avatar
July 31, 2024
Authors: Gyeongsik Moon, Takaaki Shiratori, Shunsuke Saito
cs.AI
Abstract
Facial expressions and hand motions are necessary to express our emotions and
interact with the world. Nevertheless, most 3D human avatars modeled from a
casually captured video support only body motions, without facial
expressions and hand motions. In this work, we present ExAvatar, an expressive
whole-body 3D human avatar learned from a short monocular video. We design
ExAvatar as a combination of the whole-body parametric mesh model (SMPL-X) and
3D Gaussian Splatting (3DGS). The main challenges are 1) a limited diversity of
facial expressions and poses in the video and 2) the absence of 3D
observations, such as 3D scans and RGBD images. The limited diversity in the
video makes animations with novel facial expressions and poses non-trivial. In
addition, the absence of 3D observations could cause significant ambiguity in
human parts that are not observed in the video, which can result in noticeable
artifacts under novel motions. To address these challenges, we introduce a hybrid
representation of the mesh and 3D Gaussians. Our hybrid representation treats
each 3D Gaussian as a vertex on the surface with pre-defined connectivity
information (i.e., triangle faces) between them following the mesh topology of
SMPL-X. This makes ExAvatar animatable with novel facial expressions, since it
can be driven by the facial expression space of SMPL-X. In addition, by using
connectivity-based regularizers, we significantly reduce artifacts in novel
facial expressions and poses.
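
The abstract does not say what form the connectivity-based regularizers take. As a rough illustration of the idea only, the sketch below assumes a uniform Laplacian smoothness term: each Gaussian is treated as a mesh vertex, and a per-vertex quantity (for example a position offset) is penalized for deviating from the average of its neighbors under the fixed triangle faces. The function names `build_neighbors` and `laplacian_regularizer`, the tetrahedron test mesh, and the choice of a Laplacian term are all assumptions made for this example, not details taken from the paper.

```python
from collections import defaultdict


def build_neighbors(faces):
    """Map each vertex index to the set of vertices it shares a triangle edge with."""
    neighbors = defaultdict(set)
    for a, b, c in faces:
        neighbors[a].update((b, c))
        neighbors[b].update((a, c))
        neighbors[c].update((a, b))
    return neighbors


def laplacian_regularizer(values, faces):
    """Hypothetical connectivity-based penalty (uniform Laplacian, an assumption).

    values: list of equal-length tuples, one per Gaussian/vertex
            (e.g. a 3D position offset from the template surface).
    faces:  list of (i, j, k) vertex-index triples, i.e. the fixed mesh topology.
    Returns the squared distance between each vertex value and the mean of its
    neighbors' values, averaged over all vertices that appear in a face.
    """
    neighbors = build_neighbors(faces)
    total = 0.0
    count = 0
    for i, nbrs in neighbors.items():
        for d in range(len(values[i])):
            neighbor_mean = sum(values[j][d] for j in nbrs) / len(nbrs)
            total += (values[i][d] - neighbor_mean) ** 2
        count += 1
    return total / count if count else 0.0


# Toy topology: a tetrahedron with 4 vertices and 4 triangle faces.
faces = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]

# Identical offsets everywhere: nothing deviates from its neighbors.
smooth = [(0.1, 0.0, 0.0)] * 4
# One vertex pulled away from the rest: the penalty becomes positive.
spiky = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]

smooth_loss = laplacian_regularizer(smooth, faces)
spiky_loss = laplacian_regularizer(spiky, faces)
```

Because the penalty depends only on the face list, it also constrains vertices that receive little or no image supervision, which is the intuition behind using connectivity to suppress artifacts in unobserved regions. In a real training setup this term would be written in a differentiable framework and weighted against the rendering loss.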