Single-View 3D Human Digitalization with Large Reconstruction Models
January 22, 2024
Authors: Zhenzhen Weng, Jingyuan Liu, Hao Tan, Zhan Xu, Yang Zhou, Serena Yeung-Levy, Jimei Yang
cs.AI
Abstract
In this paper, we introduce Human-LRM, a single-stage feed-forward Large Reconstruction Model designed to predict human Neural Radiance Fields (NeRF) from a single image. Our approach demonstrates remarkable adaptability when trained on extensive datasets containing 3D scans and multi-view captures. Furthermore, to enhance the model's applicability to in-the-wild scenarios, especially under occlusion, we propose a novel strategy that distills multi-view reconstruction into single-view reconstruction via a conditional triplane diffusion model. This generative extension addresses the inherent ambiguity of human body shape when observed from a single view, and makes it possible to reconstruct the full human body from an occluded image. Through extensive experiments, we show that Human-LRM surpasses previous methods by a significant margin on several benchmarks.
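To make the feed-forward image-to-triplane-NeRF idea described above concrete, the following is a minimal, hypothetical PyTorch sketch. The module names (`ImageToTriplane`, `TriplaneNeRF`), the simple CNN encoder, and all network sizes are illustrative assumptions and do not reflect the authors' actual Human-LRM architecture; the sketch only shows how a single image can be mapped to three feature planes that are then queried at 3D points to produce density and color.

```python
# Hypothetical sketch of a feed-forward single-image -> triplane NeRF pipeline.
# Module names and hyperparameters are illustrative, not the authors' code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ImageToTriplane(nn.Module):
    """Encodes a single RGB image into three axis-aligned feature planes."""

    def __init__(self, feat_dim: int = 32, plane_res: int = 64):
        super().__init__()
        self.feat_dim, self.plane_res = feat_dim, plane_res
        # A small CNN stands in for the large transformer encoder of an LRM.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(8),
        )
        # Project pooled image features to three planes (XY, XZ, YZ).
        self.to_planes = nn.Linear(128 * 8 * 8, 3 * feat_dim * plane_res ** 2)

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        b = image.shape[0]
        feats = self.encoder(image).flatten(1)
        planes = self.to_planes(feats)
        return planes.view(b, 3, self.feat_dim, self.plane_res, self.plane_res)


class TriplaneNeRF(nn.Module):
    """Decodes density and colour for 3D points by sampling the triplanes."""

    def __init__(self, feat_dim: int = 32):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 * feat_dim, 128), nn.ReLU(),
            nn.Linear(128, 4),  # (density, r, g, b)
        )

    def forward(self, planes: torch.Tensor, points: torch.Tensor) -> torch.Tensor:
        # points: (B, N, 3) in [-1, 1]; sample each plane with the matching
        # coordinate pair and concatenate the three sampled feature vectors.
        xy, xz, yz = points[..., [0, 1]], points[..., [0, 2]], points[..., [1, 2]]
        feats = []
        for i, coords in enumerate((xy, xz, yz)):
            grid = coords.unsqueeze(2)  # (B, N, 1, 2) for grid_sample
            sampled = F.grid_sample(planes[:, i], grid, align_corners=True)
            feats.append(sampled.squeeze(-1).permute(0, 2, 1))  # (B, N, C)
        return self.mlp(torch.cat(feats, dim=-1))  # (B, N, 4)


if __name__ == "__main__":
    encoder, nerf = ImageToTriplane(), TriplaneNeRF()
    image = torch.rand(1, 3, 256, 256)       # single input view
    query = torch.rand(1, 1024, 3) * 2 - 1   # 3D sample points along camera rays
    sigma_rgb = nerf(encoder(image), query)
    print(sigma_rgb.shape)                   # torch.Size([1, 1024, 4])
```

In this kind of design, the triplane acts as a compact 3D representation that a single feed-forward pass can predict, while volume rendering of the queried densities and colors along camera rays produces the final images; the conditional triplane diffusion extension mentioned in the abstract would operate on these same planes rather than on rendered pixels.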