

3DGStream: On-the-Fly Training of 3D Gaussians for Efficient Streaming of Photo-Realistic Free-Viewpoint Videos

March 3, 2024
作者: Jiakai Sun, Han Jiao, Guangyuan Li, Zhanjie Zhang, Lei Zhao, Wei Xing
cs.AI

Abstract

Constructing photo-realistic Free-Viewpoint Videos (FVVs) of dynamic scenes from multi-view videos remains a challenging endeavor. Despite the remarkable advancements achieved by current neural rendering techniques, these methods generally require complete video sequences for offline training and are not capable of real-time rendering. To address these constraints, we introduce 3DGStream, a method designed for efficient FVV streaming of real-world dynamic scenes. Our method achieves fast on-the-fly per-frame reconstruction within 12 seconds and real-time rendering at 200 FPS. Specifically, we utilize 3D Gaussians (3DGs) to represent the scene. Instead of the naïve approach of directly optimizing 3DGs per-frame, we employ a compact Neural Transformation Cache (NTC) to model the translations and rotations of 3DGs, markedly reducing the training time and storage required for each FVV frame. Furthermore, we propose an adaptive 3DG addition strategy to handle emerging objects in dynamic scenes. Experiments demonstrate that 3DGStream achieves competitive performance in terms of rendering speed, image quality, training time, and model storage when compared with state-of-the-art methods.
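
To make the core idea concrete, below is a minimal PyTorch sketch of a Neural Transformation Cache in the spirit the abstract describes: a compact MLP that maps each frozen 3D Gaussian's center to a per-frame translation and rotation (quaternion) offset. This is an illustrative reading of the abstract, not the authors' implementation; the class name, network sizes, and the identity-quaternion bias are all assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical sketch of a Neural Transformation Cache (NTC). All names and
# sizes here are illustrative assumptions, not the paper's implementation.
class NeuralTransformationCache(nn.Module):
    def __init__(self, hidden: int = 64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 7),  # 3 translation + 4 quaternion components
        )

    def forward(self, centers: torch.Tensor):
        out = self.mlp(centers)  # (N, 7) per-Gaussian outputs
        d_xyz = out[:, :3]       # translation offsets
        # Bias toward the identity quaternion (1, 0, 0, 0) so the cache starts
        # near "no rotation", then normalize to a valid unit quaternion.
        identity = torch.tensor([1.0, 0.0, 0.0, 0.0], device=out.device)
        d_quat = F.normalize(out[:, 3:] + identity, dim=-1)
        return d_xyz, d_quat

# Per-frame usage: the 3D Gaussians themselves stay fixed; only the compact
# NTC is trained, then its translations/rotations are applied to the scene.
ntc = NeuralTransformationCache()
centers = torch.rand(10_000, 3)   # stand-in for frozen 3DG centers
d_xyz, d_quat = ntc(centers)
new_centers = centers + d_xyz     # rotations would compose with each
                                  # Gaussian's stored quaternion

Because only this small network is optimized and stored for each frame, rather than the full set of per-Gaussian parameters, per-frame training time and storage stay low, which is the abstract's stated motivation for the NTC.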