Fast Registration of Photorealistic Avatars for VR Facial Animation
January 19, 2024
Authors: Chaitanya Patel, Shaojie Bai, Te-Li Wang, Jason Saragih, Shih-En Wei
cs.AI
Abstract
Virtual Reality (VR) bears the promise of social interactions that can feel more
immersive than other media. Key to this is the ability to accurately animate a
photorealistic avatar of one's likeness while wearing a VR headset. Although
high-quality registration of person-specific avatars to headset-mounted camera
(HMC) images is possible in an offline setting, the performance of generic
realtime models is significantly degraded. Online registration is also
challenging due to oblique camera views and differences in modality. In this
work, we first show that the domain gap between the avatar and headset-camera
images is one of the primary sources of difficulty, where a transformer-based
architecture achieves high accuracy on domain-consistent data, but degrades
when the domain gap is reintroduced. Building on this finding, we develop a
system design that decouples the problem into two parts: 1) an iterative
refinement module that takes in-domain inputs, and 2) a generic avatar-guided
image-to-image style transfer module that is conditioned on current estimation
of expression and head pose. These two modules reinforce each other, as image
style transfer becomes easier when close-to-ground-truth examples are shown,
and better domain-gap removal helps registration. Our system produces
high-quality results efficiently, obviating the need for costly offline
registration to generate personalized labels. We validate the accuracy and
efficiency of our approach through extensive experiments on a commodity
headset, demonstrating significant improvements over direct regression methods
as well as offline registration.
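The alternating design described above can be viewed as a fixed-point loop: render the avatar at the current estimate, use that render to guide style transfer of the HMC image into the avatar's domain, then refine the estimate on the now domain-consistent input. The following is a toy sketch of that control flow only, with hypothetical scalar stand-ins (`ToyAvatar`, `style_transfer`, `refine`) for the real neural modules; it is not the authors' implementation.

```python
from dataclasses import dataclass

# Toy stand-ins: a scalar "expression" parameter and linear update rules,
# chosen only to make the mutual-reinforcement loop concrete and runnable.

@dataclass
class ToyAvatar:
    def render(self, estimate: float) -> float:
        # Rendering is identity in this toy: the render "looks like" the estimate.
        return estimate

def style_transfer(hmc_image: float, rendered: float) -> float:
    # Avatar-guided style transfer: pull the headset observation toward the
    # avatar's (in-domain) appearance, conditioned on the current render.
    return 0.5 * (hmc_image + rendered)

def refine(estimate: float, in_domain: float, rendered: float) -> float:
    # Iterative refinement on domain-consistent inputs: step the estimate
    # toward the domain-translated observation.
    return estimate + 0.5 * (in_domain - rendered)

def register(hmc_image: float, avatar: ToyAvatar,
             estimate: float, num_iters: int = 10) -> float:
    """Alternate style transfer and refinement until the estimate settles."""
    for _ in range(num_iters):
        rendered = avatar.render(estimate)
        in_domain = style_transfer(hmc_image, rendered)
        estimate = refine(estimate, in_domain, rendered)
    return estimate
```

In this toy setting each iteration contracts the error by a constant factor, so the estimate converges to the observation; the point is only to show why better style transfer (step 2) and better refinement (step 3) reinforce each other across iterations.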