SAM-Body4D: Training-Free 4D Human Body Mesh Recovery from Videos

December 9, 2025
Authors: Mingqi Gao, Yunqi Miao, Jungong Han
cs.AI

Abstract

Human Mesh Recovery (HMR) aims to reconstruct 3D human pose and shape from 2D observations and is fundamental to human-centric understanding in real-world scenarios. While recent image-based HMR methods such as SAM 3D Body are robust on in-the-wild images, they rely on per-frame inference when applied to videos, which leads to temporal inconsistency and degraded performance under occlusion. We address these issues without extra training by leveraging the inherent continuity of human motion in videos. We propose SAM-Body4D, a training-free framework for temporally consistent and occlusion-robust HMR from videos. We first generate identity-consistent masklets using a promptable video segmentation model, then refine them with an Occlusion-Aware module to recover missing regions. The refined masklets guide SAM 3D Body to produce consistent full-body mesh trajectories, while a padding-based parallel strategy enables efficient multi-human inference. Experimental results demonstrate that SAM-Body4D achieves improved temporal stability and robustness in challenging in-the-wild videos, without any retraining. Our code and demo are available at: https://github.com/gaomingqi/sam-body4d.
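
The abstract does not say how the padding-based parallel strategy is implemented. As a rough illustration of the general idea only, the sketch below pads per-frame groups of person inputs to a common size so that a single batched call can process frames containing different numbers of people, and then drops the padded slots. The names `pad_groups`, `run_batched`, `model`, and `pad_value` are hypothetical and are not taken from the SAM-Body4D code base.

```python
from typing import Any, Callable, List, Sequence, Tuple


def pad_groups(
    groups: Sequence[Sequence[Any]], pad_value: Any
) -> Tuple[List[List[Any]], List[List[bool]]]:
    """Pad variable-length groups to one width; the mask marks real entries."""
    width = max((len(g) for g in groups), default=0)
    padded = [list(g) + [pad_value] * (width - len(g)) for g in groups]
    mask = [[i < len(g) for i in range(width)] for g in groups]
    return padded, mask


def run_batched(
    groups: Sequence[Sequence[Any]],
    model: Callable[[List[Any]], List[Any]],  # hypothetical batched function
    pad_value: Any,
) -> List[List[Any]]:
    """Call `model` once on a flattened, padded batch, then remove the padding."""
    padded, mask = pad_groups(groups, pad_value)
    width = len(padded[0]) if padded else 0
    # Flatten all groups into one batch so the model is invoked a single time.
    flat = [item for row in padded for item in row]
    outputs = model(flat)
    results = []
    for r, row_mask in enumerate(mask):
        row_out = outputs[r * width:(r + 1) * width]
        # Keep only the outputs that correspond to real (non-padded) inputs.
        results.append([o for o, keep in zip(row_out, row_mask) if keep])
    return results
```

Here each group stands for the people visible in one frame. Padding trades some wasted computation on dummy slots for a fixed batch shape, which is what usually makes a single parallel call possible.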