Real-time Monocular Full-body Capture in World Space via Sequential Proxy-to-Motion Learning

Abstract

Learning-based approaches to monocular motion capture have recently shown promising results by learning to regress in a data-driven manner. However, owing to difficulties in data collection and network design, existing solutions still struggle to achieve real-time full-body capture that is also accurate in world space. In this work, we contribute a sequential proxy-to-motion learning scheme together with a proxy dataset of 2D skeleton sequences and 3D rotational motions in world space. Such proxy data enables us to build a learning-based network with accurate full-body supervision while also mitigating generalization issues. For more accurate and physically plausible predictions, we propose a contact-aware neural motion descent module in our network so that it is aware of foot-ground contact and of motion misalignment with the proxy observations. Additionally, we share body-hand context information in our network so that the recovered wrist poses are more compatible with the full-body model. With the proposed learning-based solution, we demonstrate the first real-time monocular full-body capture system with plausible foot-ground contact in world space. More video results can be found on our project page: https://liuyebin.com/proxycap.

