ATLAS: Decoupling Skeletal and Shape Parameters for Expressive Parametric Human Modeling
August 21, 2025
Authors: Jinhyung Park, Javier Romero, Shunsuke Saito, Fabian Prada, Takaaki Shiratori, Yichen Xu, Federica Bogo, Shoou-I Yu, Kris Kitani, Rawal Khirodkar
cs.AI
Abstract
Parametric body models offer expressive 3D representations of humans across a
wide range of poses, shapes, and facial expressions, typically derived by
learning a basis over registered 3D meshes. However, existing human mesh
modeling approaches struggle to capture detailed variations across diverse body
poses and shapes, largely due to limited training data diversity and
restrictive modeling assumptions. Moreover, the common paradigm first optimizes
the external body surface using a linear basis, then regresses internal
skeletal joints from surface vertices. This approach introduces problematic
dependencies between the internal skeleton and outer soft tissue, limiting direct
control over body height and bone lengths. To address these issues, we present
ATLAS, a high-fidelity body model learned from 600k high-resolution scans
captured using 240 synchronized cameras. Unlike previous methods, we explicitly
decouple the shape and skeleton bases by grounding our mesh representation in
the human skeleton. This decoupling enables enhanced shape expressivity,
fine-grained customization of body attributes, and keypoint fitting independent
of external soft-tissue characteristics. ATLAS outperforms existing methods by
fitting unseen subjects in diverse poses more accurately, and quantitative
evaluations show that our non-linear pose correctives capture complex poses
more effectively than linear models.
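
The dependency described in the abstract can be illustrated with a toy sketch. This is not the ATLAS formulation: the array sizes, the random bases, the `parents` chain, and the per-bone scale parameterization are all assumptions chosen for illustration. It contrasts the common paradigm, where joints are regressed from shape-deformed surface vertices, with a skeleton that has its own parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
V, J, S = 50, 4, 3  # toy counts (hypothetical): vertices, joints, shape components

template = rng.normal(size=(V, 3))              # rest-pose surface vertices
shape_basis = rng.normal(size=(S, V, 3)) * 0.1  # linear shape basis over vertices

# Joint regressor: each joint is a convex combination of surface vertices.
weights = rng.random(size=(J, V))
joint_regressor = weights / weights.sum(axis=1, keepdims=True)


def coupled_joints(betas):
    """Common paradigm: deform the surface first, then regress joints from it."""
    vertices = template + np.einsum("s,svc->vc", betas, shape_basis)
    return joint_regressor @ vertices


# Decoupled toy skeleton: a simple chain with its own per-bone scale factors.
rest_joints = joint_regressor @ template
parents = [-1, 0, 1, 2]  # hypothetical kinematic chain; -1 marks the root


def decoupled_joints(bone_scales):
    """Joints depend only on skeleton parameters, not on surface shape."""
    joints = rest_joints.copy()
    for j in range(1, J):
        p = parents[j]
        offset = rest_joints[j] - rest_joints[p]
        joints[j] = joints[p] + bone_scales[j] * offset
    return joints


# Changing a surface shape coefficient moves the regressed joints.
shift = coupled_joints(np.array([1.0, 0.0, 0.0])) - coupled_joints(np.zeros(S))
print("joint shift caused by a shape change:", np.linalg.norm(shift, axis=1))

# Doubling one bone scale changes that bone's length directly.
scaled = decoupled_joints(np.array([1.0, 2.0, 1.0, 1.0]))
print("bone 1 length ratio:",
      np.linalg.norm(scaled[1] - scaled[0])
      / np.linalg.norm(rest_joints[1] - rest_joints[0]))
```

In the coupled version, any change to a shape coefficient also shifts the joints, so bone lengths cannot be set without altering the surface. In the decoupled version, bone lengths are controlled directly and the surface parameters are left untouched.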