DressRecon: Freeform 4D Human Reconstruction from Monocular Video
September 30, 2024
Authors: Jeff Tan, Donglai Xiang, Shubham Tulsiani, Deva Ramanan, Gengshan Yang
cs.AI
Abstract
We present a method to reconstruct time-consistent human body models from
monocular videos, focusing on extremely loose clothing or handheld object
interactions. Prior work in human reconstruction is either limited to tight
clothing with no object interactions, or requires calibrated multi-view
captures or personalized template scans, which are costly to collect at scale.
Our key insight for high-quality yet flexible reconstruction is the careful
combination of generic human priors about articulated body shape (learned from
large-scale training data) with video-specific articulated "bag-of-bones"
deformation (fit to a single video via test-time optimization). We accomplish
this by learning a neural implicit model that disentangles body versus clothing
deformations as separate motion model layers. To capture the subtle geometry of
clothing, we leverage image-based priors such as human body pose, surface
normals, and optical flow during optimization. The resulting neural fields can
be extracted into time-consistent meshes, or further optimized as explicit 3D
Gaussians for high-fidelity interactive rendering. On datasets with highly
challenging clothing deformations and object interactions, DressRecon yields
higher-fidelity 3D reconstructions than prior art. Project page:
https://jefftan969.github.io/dressrecon/
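The abstract's central design is a hierarchical motion model: a coarse articulated body layer composed with a finer clothing layer, each parameterized as a "bag of bones". The sketch below illustrates this idea in PyTorch, assuming Gaussian-falloff skinning weights and per-frame bone translations (rotations omitted for brevity). All class and variable names here (BagOfBonesLayer, HierarchicalDeformation, etc.) are hypothetical illustrations of the concept, not the authors' actual implementation.

```python
# Minimal sketch of a two-layer "bag-of-bones" deformation field,
# as described in the abstract. Hypothetical names, not DressRecon's API.
import torch
import torch.nn as nn


class BagOfBonesLayer(nn.Module):
    """One motion layer: a set of Gaussian 'bones', each with a learnable
    center and per-frame translation. Points are warped by blending bone
    motions with skinning weights from Gaussian distance falloff."""

    def __init__(self, num_bones: int, num_frames: int):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(num_bones, 3) * 0.1)
        # Per-frame translation for each bone (rotations omitted for brevity).
        self.trans = nn.Parameter(torch.zeros(num_frames, num_bones, 3))
        self.log_scale = nn.Parameter(torch.zeros(num_bones))

    def forward(self, pts: torch.Tensor, frame: int) -> torch.Tensor:
        # pts: (N, 3) canonical points -> (N, 3) deformed points.
        d2 = ((pts[:, None, :] - self.centers[None]) ** 2).sum(-1)   # (N, B)
        w = torch.softmax(-d2 / self.log_scale.exp() ** 2, dim=-1)   # skinning weights
        offset = (w[..., None] * self.trans[frame][None]).sum(1)     # blended motion
        return pts + offset


class HierarchicalDeformation(nn.Module):
    """Compose a coarse body layer with a finer clothing layer, mirroring
    the body-versus-clothing disentanglement described in the abstract."""

    def __init__(self, num_frames: int):
        super().__init__()
        self.body = BagOfBonesLayer(num_bones=25, num_frames=num_frames)
        self.clothing = BagOfBonesLayer(num_bones=64, num_frames=num_frames)

    def forward(self, pts: torch.Tensor, frame: int) -> torch.Tensor:
        return self.clothing(self.body(pts, frame), frame)


# Usage: warp canonical samples into frame t before querying the implicit field.
model = HierarchicalDeformation(num_frames=100)
canonical_pts = torch.rand(1024, 3)
deformed_pts = model(canonical_pts, frame=10)
```

Both layers would be fit per video via test-time optimization, with the image-based priors mentioned above (body pose, surface normals, optical flow) supplying the supervision signals.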