Single-View 3D Human Digitalization with Large Reconstruction Models

Abstract

In this paper, we introduce Human-LRM, a single-stage feed-forward Large Reconstruction Model that predicts a human Neural Radiance Field (NeRF) from a single image. Our approach demonstrates remarkable adaptability when trained on extensive datasets containing 3D scans and multi-view captures. Furthermore, to enhance the model's applicability to in-the-wild scenarios, especially those with occlusions, we propose a novel strategy that distills multi-view reconstruction into single-view reconstruction via a conditional triplane diffusion model. This generative extension addresses the inherent variations in human body shape when observed from a single view, and makes it possible to reconstruct the full human body from an occluded image. Through extensive experiments, we show that Human-LRM surpasses previous methods by a significant margin on several benchmarks.
