IMUSIC: IMU-based Facial Expression Capture
February 3, 2024
Authors: Youjia Wang, Yiwen Wu, Ruiqian Li, Hengan Zhou, Hongyang Lin, Yingwenqi Jiang, Yingsheng Zhu, Guanpeng Long, Jingya Wang, Lan Xu, Jingyi Yu
cs.AI
Abstract
For facial motion capture and analysis, the dominant solutions are generally based on visual cues, which cannot protect privacy and are vulnerable to occlusions. Inertial measurement units (IMUs) offer a potential remedy, yet they have mainly been adopted for full-body motion capture. In this paper, we propose IMUSIC to fill this gap: a novel path for facial expression capture using purely IMU signals, departing significantly from previous visual solutions. The key design in IMUSIC is a trilogy. First, we design micro-IMUs suited to facial capture, accompanied by an anatomy-driven IMU placement scheme. Then, we contribute a novel IMU-ARKit dataset, which provides rich paired IMU/visual signals for diverse facial expressions and performances. Such unique multi-modality brings huge potential for future directions such as IMU-based facial behavior analysis. Moreover, utilizing IMU-ARKit, we introduce a strong baseline approach that accurately predicts facial blendshape parameters from purely IMU signals. Specifically, we tailor a Transformer diffusion model with a two-stage training strategy for this novel tracking task. The IMUSIC framework empowers us to perform accurate facial capture in scenarios where visual methods falter, while simultaneously safeguarding user privacy. We conduct extensive experiments on both the IMU configuration and the technical components to validate the effectiveness of our IMUSIC approach. Notably, IMUSIC enables various novel applications, e.g., privacy-preserving facial capture, hybrid capture against occlusions, and detecting minute facial movements that are often invisible to visual cues. We will release our dataset and implementations to enrich the possibilities of facial capture and analysis in our community.
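
The abstract describes a Transformer diffusion model that predicts ARKit-style blendshape parameters from pure IMU signals. As a rough illustration only (not the authors' released code), the following PyTorch sketch shows one plausible shape of such a conditional denoiser; the number of IMUs, the channel layout, the window length, and all layer sizes are assumptions made for this sketch.

```python
# Hypothetical sketch: a Transformer denoiser that maps a window of facial IMU
# signals to ARKit-style blendshape coefficients, in the spirit of the
# diffusion-based tracking the abstract describes. All dimensions below are
# illustrative assumptions, not values from the paper.
import torch
import torch.nn as nn

NUM_IMUS = 6          # assumed number of facial IMUs
IMU_CHANNELS = 6      # 3-axis accelerometer + 3-axis gyroscope per IMU
NUM_BLENDSHAPES = 52  # ARKit blendshape coefficients
WINDOW = 30           # assumed temporal window (frames)
D_MODEL = 128

class IMUToBlendshapeDenoiser(nn.Module):
    """Predicts the noise added to a blendshape sequence, conditioned on IMU input."""
    def __init__(self):
        super().__init__()
        self.imu_proj = nn.Linear(NUM_IMUS * IMU_CHANNELS, D_MODEL)
        self.bs_proj = nn.Linear(NUM_BLENDSHAPES, D_MODEL)
        self.t_embed = nn.Embedding(1000, D_MODEL)  # diffusion timestep embedding
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=D_MODEL, nhead=4, dim_feedforward=256, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=4)
        self.head = nn.Linear(D_MODEL, NUM_BLENDSHAPES)

    def forward(self, noisy_blendshapes, imu_window, t):
        # noisy_blendshapes: (B, WINDOW, NUM_BLENDSHAPES)
        # imu_window:        (B, WINDOW, NUM_IMUS * IMU_CHANNELS)
        # t:                 (B,) integer diffusion timesteps
        x = self.bs_proj(noisy_blendshapes) + self.imu_proj(imu_window)
        x = x + self.t_embed(t).unsqueeze(1)  # broadcast timestep over frames
        x = self.encoder(x)
        return self.head(x)  # predicted noise, same shape as the blendshape sequence

if __name__ == "__main__":
    model = IMUToBlendshapeDenoiser()
    imu = torch.randn(2, WINDOW, NUM_IMUS * IMU_CHANNELS)
    noisy = torch.randn(2, WINDOW, NUM_BLENDSHAPES)
    t = torch.randint(0, 1000, (2,))
    print(model(noisy, imu, t).shape)  # torch.Size([2, 30, 52])
```

In a standard diffusion setup, a network of this form would be trained to predict the noise added to ground-truth blendshape sequences (here, the ARKit-derived labels in IMU-ARKit) and then iteratively denoise from random noise at inference, conditioned on the recorded IMU window; the paper's two-stage training strategy and exact architecture should be taken from the released implementation once available.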