3DGStream: On-the-Fly Training of 3D Gaussians for Efficient Streaming of Photo-Realistic Free-Viewpoint Videos
March 3, 2024
作者: Jiakai Sun, Han Jiao, Guangyuan Li, Zhanjie Zhang, Lei Zhao, Wei Xing
cs.AI
Abstract
Constructing photo-realistic Free-Viewpoint Videos (FVVs) of dynamic scenes from multi-view videos remains a challenging endeavor. Despite the remarkable advancements achieved by current neural rendering techniques, these methods generally require complete video sequences for offline training and are not capable of real-time rendering. To address these constraints, we introduce 3DGStream, a method designed for efficient FVV streaming of real-world dynamic scenes. Our method achieves fast on-the-fly per-frame reconstruction within 12 seconds and real-time rendering at 200 FPS. Specifically, we utilize 3D Gaussians (3DGs) to represent the scene. Instead of the naïve approach of directly optimizing 3DGs per-frame, we employ a compact Neural Transformation Cache (NTC) to model the translations and rotations of 3DGs, markedly reducing the training time and storage required for each FVV frame. Furthermore, we propose an adaptive 3DG addition strategy to handle emerging objects in dynamic scenes. Experiments demonstrate that 3DGStream achieves competitive performance in terms of rendering speed, image quality, training time, and model storage when compared with state-of-the-art methods.
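
The abstract's central idea is to keep a fixed set of 3D Gaussians and, for each new frame, train only a compact Neural Transformation Cache (NTC) that predicts a per-Gaussian translation and rotation, rather than re-optimizing every Gaussian parameter. The sketch below is a minimal, hypothetical PyTorch illustration of that idea; the module name, the plain-MLP architecture, the dummy loss, and all hyperparameters are assumptions made for illustration, not the authors' implementation (which, per the abstract, also adds new Gaussians adaptively and renders the transformed Gaussians to compute a photometric loss against the new frame's multi-view images).

```python
import torch
import torch.nn as nn

class NeuralTransformationCache(nn.Module):
    """Hypothetical sketch of a compact NTC.

    Maps each 3D Gaussian's center to a translation vector and a rotation
    quaternion for the current frame. The actual 3DGStream NTC may use a
    different encoding/architecture; this stand-in uses a plain MLP.
    """

    def __init__(self, hidden_dim: int = 64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3, hidden_dim),
            nn.ReLU(inplace=True),
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(inplace=True),
            nn.Linear(hidden_dim, 7),  # 3 translation + 4 quaternion components
        )

    def forward(self, centers: torch.Tensor):
        # centers: (N, 3) Gaussian centers carried over from the previous frame
        out = self.mlp(centers)
        translation = out[:, :3]
        # Normalize so the quaternion represents a valid rotation
        quaternion = torch.nn.functional.normalize(out[:, 3:], dim=-1)
        return translation, quaternion


# Schematic per-frame, on-the-fly training loop: only the small NTC is
# optimized, which is what keeps per-frame training time and storage low.
ntc = NeuralTransformationCache()
optimizer = torch.optim.Adam(ntc.parameters(), lr=1e-3)
centers = torch.randn(10_000, 3)  # placeholder for real Gaussian centers
for step in range(100):
    translation, quaternion = ntc(centers)
    new_centers = centers + translation
    # In the real pipeline: loss = photometric_loss(render(new_centers,
    # quaternion, ...), target_images) using a 3D Gaussian rasterizer.
    loss = new_centers.square().mean()  # dummy loss so the sketch runs standalone
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```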