
HumanMM: Global Human Motion Recovery from Multi-shot Videos

March 10, 2025
Authors: Yuhong Zhang, Guanlin Wu, Ling-Hao Chen, Zhuokai Zhao, Jing Lin, Xiaoke Jiang, Jiamin Wu, Zhuoheng Li, Hao Frank Yang, Haoqian Wang, Lei Zhang
cs.AI

Abstract

In this paper, we present a novel framework for reconstructing long-sequence 3D human motion in world coordinates from in-the-wild videos with multiple shot transitions. Such long-sequence in-the-wild motions are highly valuable for applications such as motion generation and motion understanding, but they are very challenging to recover because of the abrupt shot transitions, partial occlusions, and dynamic backgrounds present in such videos. Existing methods primarily focus on single-shot videos, where continuity is maintained within a single camera view, or simplify multi-shot alignment to camera space only. In this work, we address these challenges by integrating enhanced camera pose estimation with Human Motion Recovery (HMR), and by incorporating a shot transition detector and a robust alignment module to achieve accurate pose and orientation continuity across shots. By leveraging a custom motion integrator, we effectively mitigate foot sliding and ensure temporal consistency in human pose. Extensive evaluations on a multi-shot dataset that we created from public 3D human datasets demonstrate the robustness of our method in reconstructing realistic human motion in world coordinates.
