HumanRF: High-Fidelity Neural Radiance Fields for Humans in Motion
May 10, 2023
作者: Mustafa Işık, Martin Rünz, Markos Georgopoulos, Taras Khakhulin, Jonathan Starck, Lourdes Agapito, Matthias Nießner
cs.AI
Abstract
Representing human performance at high fidelity is an essential building
block in diverse applications, such as film production, computer games or
videoconferencing. To close the gap to production-level quality, we introduce
HumanRF, a 4D dynamic neural scene representation that captures full-body
appearance in motion from multi-view video input, and enables playback from
novel, unseen viewpoints. Our novel representation acts as a dynamic video
encoding that captures fine details at high compression rates by factorizing
space-time into a temporal matrix-vector decomposition. This allows us to
obtain temporally coherent reconstructions of human actors for long sequences,
while representing high-resolution details even in the context of challenging
motion. While most research focuses on synthesizing at resolutions of 4MP or
lower, we address the challenge of operating at 12MP. To this end, we introduce
ActorsHQ, a novel multi-view dataset that provides 12MP footage from 160
cameras for 16 sequences with high-fidelity, per-frame mesh reconstructions. We
demonstrate challenges that emerge from using such high-resolution data and
show that our newly introduced HumanRF effectively leverages this data, making
a significant step towards production-level quality novel view synthesis.
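To give a concrete sense of the space-time factorization the abstract describes, the sketch below approximates a 4D feature grid `T[x, y, z, t]` as a sum of rank-1 terms, each pairing a 3D spatial volume with a per-frame temporal vector. This is an illustrative toy, not the authors' implementation: HumanRF's actual representation uses learned features, and all names and shapes here are hypothetical. The point it demonstrates is the compression argument: storing `R*(X*Y*Z + T)` numbers instead of `X*Y*Z*T`.

```python
import numpy as np

X = Y = Z = 8   # tiny spatial grid for the demo
T = 16          # number of frames
R = 4           # decomposition rank

rng = np.random.default_rng(0)

# Construct a ground-truth low-rank 4D tensor, so an exact fit exists:
# each rank term is a 3D spatial volume G_r times a temporal vector v_r.
G_true = rng.normal(size=(R, X, Y, Z))
v_true = rng.normal(size=(R, T))
dense = np.einsum('rxyz,rt->xyzt', G_true, v_true)

def reconstruct(G, v):
    """Recombine spatial volumes and temporal vectors into a 4D grid."""
    return np.einsum('rxyz,rt->xyzt', G, v)

approx = reconstruct(G_true, v_true)
err = np.abs(dense - approx).max()

# Storage cost of the factored form vs. the dense 4D grid.
compression = (X * Y * Z * T) / (R * (X * Y * Z + T))
print(f"max reconstruction error: {err:.2e}")
print(f"compression ratio: {compression:.1f}x")
```

Even at this toy scale the factored form stores roughly 4x fewer values than the dense grid; at 12MP resolutions and long sequences, the same low-rank structure is what makes temporally coherent reconstruction tractable.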