Real-time Monocular Full-body Capture in World Space via Sequential Proxy-to-Motion Learning
July 3, 2023
Authors: Yuxiang Zhang, Hongwen Zhang, Liangxiao Hu, Hongwei Yi, Shengping Zhang, Yebin Liu
cs.AI
Abstract
Learning-based approaches to monocular motion capture have recently shown
promising results by learning to regress in a data-driven manner. However, due
to the challenges in data collection and network designs, it remains
challenging for existing solutions to achieve real-time full-body capture while
being accurate in world space. In this work, we contribute a sequential
proxy-to-motion learning scheme together with a proxy dataset of 2D skeleton
sequences and 3D rotational motions in world space. Such proxy data enables us
to build a learning-based network with accurate full-body supervision while
also mitigating the generalization issues. For more accurate and physically
plausible predictions, a contact-aware neural motion descent module is proposed
in our network so that it can be aware of foot-ground contact and motion
misalignment with the proxy observations. Additionally, we share body-hand
context information in our network to recover wrist poses that are more
compatible with the full-body model. With the proposed learning-based solution, we
demonstrate the first real-time monocular full-body capture system with
plausible foot-ground contact in world space. More video results can be found
at our project page: https://liuyebin.com/proxycap.
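The abstract describes a contact-aware neural motion descent module that iteratively refines the predicted motion using two signals: misalignment with the 2D proxy observations and foot-ground contact. The paper does not give its update rule here, so the following is only a minimal toy sketch of that general idea, assuming a simple orthographic camera, a ground plane at y = 0, and a hand-rolled gradient step on the root translation; every name and formula below is a hypothetical illustration, not the authors' actual module.

```python
import numpy as np

def project(joints3d):
    """Orthographic projection onto the x/z image plane (an assumption)."""
    return joints3d[:, [0, 2]]

def descent_step(joints3d, joints2d_obs, foot_idx, lr=0.5):
    """One toy descent update on the root translation, combining a
    2D-misalignment term and a foot-ground-contact term."""
    # Misalignment term: mean 2D reprojection residual drives x/z.
    residual2d = project(joints3d) - joints2d_obs          # (N, 2)
    grad = np.zeros(3)
    grad[[0, 2]] = residual2d.mean(axis=0)
    # Contact term: push the lowest foot joint onto the ground plane y = 0.
    grad[1] = joints3d[foot_idx, 1].min()
    return -lr * grad                                      # translation update

# Toy example: a 3-joint "body" floating 0.2 above the ground and
# shifted 0.1 in x relative to the 2D proxy observations.
joints = np.array([[0.1, 1.2, 0.0],    # head
                   [0.1, 0.2, 0.0],    # left foot
                   [0.1, 0.2, 0.1]])   # right foot
obs2d = np.array([[0.0, 0.0], [0.0, 0.0], [0.0, 0.1]])

t = np.zeros(3)
for _ in range(20):
    t += descent_step(joints + t, obs2d, foot_idx=[1, 2])
# After refinement, the body is re-centered on the observations
# and the feet rest on the ground plane.
```

In the actual system the update is predicted by a learned network conditioned on these error signals rather than computed analytically, which is what makes the module fast enough for real-time capture.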