HumanRF: High-Fidelity Neural Radiance Fields for Humans in Motion
May 10, 2023
Authors: Mustafa Işık, Martin Rünz, Markos Georgopoulos, Taras Khakhulin, Jonathan Starck, Lourdes Agapito, Matthias Nießner
cs.AI
Abstract
Representing human performance at high fidelity is an essential building
block in diverse applications, such as film production, computer games or
videoconferencing. To close the gap to production-level quality, we introduce
HumanRF, a 4D dynamic neural scene representation that captures full-body
appearance in motion from multi-view video input, and enables playback from
novel, unseen viewpoints. Our novel representation acts as a dynamic video
encoding that captures fine details at high compression rates by factorizing
space-time into a temporal matrix-vector decomposition. This allows us to
obtain temporally coherent reconstructions of human actors for long sequences,
while representing high-resolution details even in the context of challenging
motion. While most research focuses on synthesizing at resolutions of 4MP or
lower, we address the challenge of operating at 12MP. To this end, we introduce
ActorsHQ, a novel multi-view dataset that provides 12MP footage from 160
cameras for 16 sequences with high-fidelity, per-frame mesh reconstructions. We
demonstrate challenges that emerge from using such high-resolution data and
show that our newly introduced HumanRF effectively leverages this data, making
a significant step towards production-level quality novel view synthesis.
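To make the abstract's "temporal matrix-vector decomposition" concrete, the sketch below shows a simplified low-rank space-time factorization: a dense 4D feature grid over (x, y, z, t) is replaced by R spatial volumes paired with R temporal vectors. This is an illustrative assumption, not HumanRF's actual architecture (the paper's decomposition is more elaborate), but it demonstrates where the high compression rate comes from: storage drops from X·Y·Z·T values to R·(X·Y·Z + T).

```python
import numpy as np

rng = np.random.default_rng(0)

# Dense 4D feature grid over space-time would cost X*Y*Z*T floats.
X, Y, Z, T, R = 16, 16, 16, 32, 4  # R = rank of the decomposition (assumed)

# Low-rank factors: R spatial volumes ("matrix" part, space flattened)
# and R temporal vectors ("vector" part, one weight per frame).
spatial = rng.standard_normal((R, X, Y, Z))
temporal = rng.standard_normal((R, T))

def query(ix, iy, iz, it):
    """Reconstruct one feature value: sum_r spatial_r(x,y,z) * temporal_r(t)."""
    return float(np.dot(spatial[:, ix, iy, iz], temporal[:, it]))

# Compression ratio relative to the dense grid.
dense_params = X * Y * Z * T                  # 131072
factored_params = R * (X * Y * Z + T)         # 16512
print(f"compression: {dense_params / factored_params:.1f}x")  # prints compression: 7.9x
```

In a neural field, `query` would be evaluated at continuous sample points (with interpolation) and its output fed to a small MLP; the key point is that temporal coherence falls out of the shared spatial factors, since nearby frames reuse the same volumes with different temporal weights.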