TaoAvatar: Real-Time Lifelike Full-Body Talking Avatars for Augmented Reality via 3D Gaussian Splatting
March 21, 2025
Authors: Jianchuan Chen, Jingchuan Hu, Gaige Wang, Zhonghua Jiang, Tiansong Zhou, Zhiwen Chen, Chengfei Lv
cs.AI
Abstract
Realistic 3D full-body talking avatars hold great potential in AR, with
applications ranging from e-commerce live streaming to holographic
communication. Despite advances in 3D Gaussian Splatting (3DGS) for lifelike
avatar creation, existing methods struggle with fine-grained control of facial
expressions and body movements in full-body talking tasks. Additionally, they
often lack sufficient details and cannot run in real-time on mobile devices. We
present TaoAvatar, a high-fidelity, lightweight, 3DGS-based full-body talking
avatar driven by various signals. Our approach starts by creating a
personalized clothed human parametric template that binds Gaussians to
represent appearances. We then pre-train a StyleUnet-based network to handle
complex pose-dependent non-rigid deformation, which can capture high-frequency
appearance details but is too resource-intensive for mobile devices. To
overcome this, we "bake" the non-rigid deformations into a lightweight
MLP-based network using a distillation technique and develop blend shapes to
compensate for details. Extensive experiments show that TaoAvatar achieves
state-of-the-art rendering quality while running in real-time across various
devices, maintaining 90 FPS on high-definition stereo devices such as the Apple
Vision Pro.
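The "baking" step described above — distilling an expensive pose-dependent deformation network into a lightweight student model by regressing the student onto the teacher's outputs — can be illustrated with a toy one-dimensional sketch. Everything here (the sine-based teacher, the linear student, the learning rate, the 1-D setup) is illustrative only and is not the paper's actual StyleUnet/MLP implementation:

```python
import random
import math

def teacher_deform(pose):
    """Stand-in for an expensive pose-dependent non-rigid deformation
    (the paper's StyleUnet-based network); deliberately nonlinear."""
    return math.sin(2.0 * pose) + 0.5 * pose

# Student: a tiny linear model y = w*pose + b, fit by stochastic gradient
# descent on an MSE distillation loss against teacher outputs.
w, b = 0.0, 0.0
lr = 0.05
random.seed(0)
for step in range(2000):
    pose = random.uniform(-1.0, 1.0)
    target = teacher_deform(pose)   # query the teacher ("bake" its output)
    pred = w * pose + b
    err = pred - target
    w -= lr * err * pose            # dMSE/dw
    b -= lr * err                   # dMSE/db

# The residual (teacher minus student) is what the paper compensates with
# learned blend shapes; here we just measure the approximation error.
max_err = max(abs(teacher_deform(p / 50.0) - (w * p / 50.0 + b))
              for p in range(-50, 51))
print(f"w={w:.3f} b={b:.3f} max_err={max_err:.3f}")
```

The real system replaces the scalar pose with full body/face parameters and the linear student with a small MLP, but the distillation objective — match the heavy network's deformations with a model cheap enough for mobile inference — is the same shape.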