SAM 3D Body: Robust Full-Body Human Mesh Recovery

February 17, 2026
Authors: Xitong Yang, Devansh Kukreja, Don Pinkus, Anushka Sagar, Taosha Fan, Jinhyung Park, Soyong Shin, Jinkun Cao, Jiawei Liu, Nicolas Ugrinovic, Matt Feiszli, Jitendra Malik, Piotr Dollar, Kris Kitani
cs.AI

Abstract

We introduce SAM 3D Body (3DB), a promptable model for single-image full-body 3D human mesh recovery (HMR) that demonstrates state-of-the-art performance, with strong generalization and consistent accuracy in diverse in-the-wild conditions. 3DB estimates the human pose of the body, feet, and hands. It is the first model to use a new parametric mesh representation, Momentum Human Rig (MHR), which decouples skeletal structure and surface shape. 3DB employs an encoder-decoder architecture and supports auxiliary prompts, including 2D keypoints and masks, enabling user-guided inference similar to the SAM family of models. We derive high-quality annotations from a multi-stage annotation pipeline that uses various combinations of manual keypoint annotation, differentiable optimization, multi-view geometry, and dense keypoint detection. Our data engine efficiently selects and processes data to ensure data diversity, collecting unusual poses and rare imaging conditions. We present a new evaluation dataset organized by pose and appearance categories, enabling nuanced analysis of model behavior. Our experiments demonstrate superior generalization and substantial improvements over prior methods in both qualitative user preference studies and traditional quantitative analysis. Both 3DB and MHR are open-source.