

3DGStream: On-the-Fly Training of 3D Gaussians for Efficient Streaming of Photo-Realistic Free-Viewpoint Videos

March 3, 2024
Authors: Jiakai Sun, Han Jiao, Guangyuan Li, Zhanjie Zhang, Lei Zhao, Wei Xing
cs.AI

Abstract

Constructing photo-realistic Free-Viewpoint Videos (FVVs) of dynamic scenes from multi-view videos remains a challenging endeavor. Despite the remarkable advancements achieved by current neural rendering techniques, these methods generally require complete video sequences for offline training and are not capable of real-time rendering. To address these constraints, we introduce 3DGStream, a method designed for efficient FVV streaming of real-world dynamic scenes. Our method achieves fast on-the-fly per-frame reconstruction within 12 seconds and real-time rendering at 200 FPS. Specifically, we utilize 3D Gaussians (3DGs) to represent the scene. Instead of the naïve approach of directly optimizing 3DGs per-frame, we employ a compact Neural Transformation Cache (NTC) to model the translations and rotations of 3DGs, markedly reducing the training time and storage required for each FVV frame. Furthermore, we propose an adaptive 3DG addition strategy to handle emerging objects in dynamic scenes. Experiments demonstrate that 3DGStream achieves competitive performance in terms of rendering speed, image quality, training time, and model storage when compared with state-of-the-art methods.
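
The per-frame update hinges on the Neural Transformation Cache: rather than re-optimizing every 3D Gaussian for a new frame, a small network is trained to predict a translation and a rotation offset for each Gaussian of the previous frame. Below is a minimal PyTorch sketch of that idea; the plain MLP, its sizes, and the quaternion update are illustrative assumptions for this sketch, not the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class NeuralTransformationCacheSketch(nn.Module):
    """Toy stand-in for the NTC: Gaussian position in, per-Gaussian motion out."""
    def __init__(self, hidden_dim: int = 64):
        super().__init__()
        # A small MLP; hidden_dim and depth are arbitrary choices for the sketch.
        self.mlp = nn.Sequential(
            nn.Linear(3, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, 7),  # 3 translation components + 4 quaternion components
        )

    def forward(self, xyz):
        out = self.mlp(xyz)
        d_xyz = out[:, :3]                       # per-Gaussian translation
        d_rot = F.normalize(out[:, 3:], dim=-1)  # per-Gaussian rotation as a unit quaternion
        return d_xyz, d_rot

def quat_multiply(q, r):
    # Hamilton product of (w, x, y, z) quaternions, batched along the first dimension.
    w1, x1, y1, z1 = q.unbind(-1)
    w2, x2, y2, z2 = r.unbind(-1)
    return torch.stack([
        w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
        w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
        w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
        w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2,
    ], dim=-1)

# Usage sketch: per frame, only the cache is optimized against the new multi-view
# images; its output warps the previous frame's Gaussians into the current frame.
ntc = NeuralTransformationCacheSketch()
prev_xyz = torch.randn(10_000, 3)                     # centers of last frame's Gaussians
prev_rot = F.normalize(torch.randn(10_000, 4), dim=-1)
d_xyz, d_rot = ntc(prev_xyz)
cur_xyz = prev_xyz + d_xyz                            # translated centers
cur_rot = quat_multiply(d_rot, prev_rot)              # composed orientations

Because only this small network is optimized for each new frame, per-frame training time and storage stay low; emerging objects that such a warp cannot explain are handled by the paper's adaptive 3DG addition strategy, which is not shown here.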