Single-View 3D Human Digitalization with Large Reconstruction Models

January 22, 2024
Authors: Zhenzhen Weng, Jingyuan Liu, Hao Tan, Zhan Xu, Yang Zhou, Serena Yeung-Levy, Jimei Yang
cs.AI

Abstract

In this paper, we introduce Human-LRM, a single-stage feed-forward Large Reconstruction Model designed to predict human Neural Radiance Fields (NeRF) from a single image. Our approach demonstrates remarkable adaptability in training on extensive datasets containing 3D scans and multi-view captures. Furthermore, to enhance the model's applicability to in-the-wild scenarios, especially those with occlusions, we propose a novel strategy that distills multi-view reconstruction into single-view reconstruction via a conditional triplane diffusion model. This generative extension addresses the inherent variations in human body shape when observed from a single view, and makes it possible to reconstruct the full human body from an occluded image. Through extensive experiments, we show that Human-LRM surpasses previous methods by a significant margin on several benchmarks.