Fast Registration of Photorealistic Avatars for VR Facial Animation
January 19, 2024
Authors: Chaitanya Patel, Shaojie Bai, Te-Li Wang, Jason Saragih, Shih-En Wei
cs.AI
Abstract
Virtual Reality (VR) bears the promise of social interactions that can feel more
immersive than other media. Key to this is the ability to accurately animate a
photorealistic avatar of one's likeness while wearing a VR headset. Although
high quality registration of person-specific avatars to headset-mounted camera
(HMC) images is possible in an offline setting, the performance of generic
realtime models is significantly degraded. Online registration is also
challenging due to oblique camera views and differences in modality. In this
work, we first show that the domain gap between the avatar and headset-camera
images is one of the primary sources of difficulty, where a transformer-based
architecture achieves high accuracy on domain-consistent data, but degrades
when the domain-gap is re-introduced. Building on this finding, we develop a
system design that decouples the problem into two parts: 1) an iterative
refinement module that takes in-domain inputs, and 2) a generic avatar-guided
image-to-image style transfer module that is conditioned on current estimation
of expression and head pose. These two modules reinforce each other, as image
style transfer becomes easier when close-to-ground-truth examples are shown,
and better domain-gap removal helps registration. Our system produces
high-quality results efficiently, obviating the need for costly offline
registration to generate personalized labels. We validate the accuracy and
efficiency of our approach through extensive experiments on a commodity
headset, demonstrating significant improvements over direct regression methods
as well as offline registration.
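The alternating two-module design described above can be sketched as a simple loop. This is a minimal illustration only: all function names and the scalar expression/pose state are hypothetical placeholders, not the paper's released code, and the module bodies are stubs that stand in for the learned style-transfer and refinement networks.

```python
# Hedged sketch of the two-module alternating design. Function names
# (style_transfer, render_avatar, refine, register) are hypothetical.
# The loop alternates between (2) style-transferring the headset-camera
# (HMC) image into the avatar's rendering domain, conditioned on the
# current expression/pose estimate, and (1) refining that estimate from
# the now domain-consistent image pair.

def render_avatar(expression, head_pose):
    # Placeholder for the person-specific avatar decoder.
    return [expression + head_pose] * 4

def style_transfer(hmc_image, avatar_render):
    # Placeholder: a generic avatar-guided image-to-image network would
    # map the HMC image into the avatar domain, using the current render
    # as guidance. Here we just blend pixels to keep the sketch runnable.
    return [(h + a) / 2 for h, a in zip(hmc_image, avatar_render)]

def refine(estimate, in_domain_image, avatar_render):
    # Placeholder for the iterative refinement module: nudge the
    # estimate toward agreement between the two in-domain images.
    expr, pose = estimate
    residual = sum(i - a for i, a in zip(in_domain_image, avatar_render))
    return (expr + 0.1 * residual, pose)

def register(hmc_image, n_iters=3):
    estimate = (0.0, 0.0)  # (expression, head pose), scalars for brevity
    for _ in range(n_iters):
        render = render_avatar(*estimate)
        in_domain = style_transfer(hmc_image, render)   # module 2
        estimate = refine(estimate, in_domain, render)  # module 1
    return estimate

expr, pose = register([1.0, 1.0, 1.0, 1.0])
```

The mutual reinforcement the abstract describes shows up in the loop structure: a better estimate yields a closer avatar render, which makes the style transfer easier, which in turn gives the refinement module cleaner in-domain input.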