V^3: Viewing Volumetric Videos on Mobiles via Streamable 2D Dynamic Gaussians

September 20, 2024
Authors: Penghao Wang, Zhirui Zhang, Liao Wang, Kaixin Yao, Siyuan Xie, Jingyi Yu, Minye Wu, Lan Xu
cs.AI

Abstract

Experiencing high-fidelity volumetric video as seamlessly as 2D video is a long-held dream. However, current dynamic 3DGS methods, despite their high rendering quality, are difficult to stream on mobile devices because of computational and bandwidth constraints. In this paper, we introduce V^3 (Viewing Volumetric Videos), a novel approach that enables high-quality mobile rendering by streaming dynamic Gaussians. Our key innovation is to treat dynamic 3DGS as 2D video, which allows the use of hardware video codecs. Additionally, we propose a two-stage training strategy that reduces storage requirements while training quickly. The first stage employs hash encoding and a shallow MLP to learn motion, then prunes Gaussians to reduce their number and meet streaming requirements; the second stage fine-tunes the other Gaussian attributes using a residual entropy loss and a temporal loss to improve temporal continuity. This strategy, which disentangles motion and appearance, maintains high rendering quality with compact storage. We also design a multi-platform player to decode and render 2D Gaussian videos. Extensive experiments demonstrate the effectiveness of V^3, which outperforms other methods by enabling high-quality rendering and streaming on common devices, something not achieved before. As the first to stream dynamic Gaussians on mobile devices, our companion player offers users an unprecedented volumetric video experience, including smooth scrolling and instant sharing. Our project page with source code is available at https://authoritywang.github.io/v3/.
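
To make the idea of treating Gaussian attributes as 2D video concrete, the following is a minimal sketch, not the authors' implementation. It assumes that one scalar attribute per Gaussian (for example, one position coordinate) is quantized to 8 bits and laid out on a square grid, so that each time step becomes one grayscale frame that a standard video codec could compress. The function names `pack_attribute` and `unpack_attribute`, the square layout, and the fixed value range are all hypothetical choices made for illustration.

```python
import math

import numpy as np


def pack_attribute(values: np.ndarray, lo: float, hi: float) -> np.ndarray:
    """Quantize one per-Gaussian attribute (shape [N]) to uint8 and
    arrange it as a square 2D image (hypothetical layout)."""
    n = values.shape[0]
    side = math.isqrt(n - 1) + 1 if n > 0 else 1  # smallest side with side*side >= n
    scaled = np.clip((values - lo) / (hi - lo), 0.0, 1.0)
    quantized = np.round(scaled * 255.0).astype(np.uint8)
    image = np.zeros(side * side, dtype=np.uint8)  # unused pixels stay zero
    image[:n] = quantized
    return image.reshape(side, side)


def unpack_attribute(image: np.ndarray, n: int, lo: float, hi: float) -> np.ndarray:
    """Invert pack_attribute, up to quantization error."""
    quantized = image.reshape(-1)[:n].astype(np.float32)
    return quantized / 255.0 * (hi - lo) + lo


# Example: ten attribute values in [-1, 1] become one 4x4 grayscale frame.
values = np.linspace(-1.0, 1.0, 10).astype(np.float32)
frame = pack_attribute(values, -1.0, 1.0)
restored = unpack_attribute(frame, values.shape[0], -1.0, 1.0)
```

In this sketch each Gaussian keeps the same pixel position in every frame, so an attribute that changes smoothly over time yields a frame sequence with small frame-to-frame differences, which is the kind of input video codecs are built to compress. Whether V^3 uses this exact layout or bit depth is not stated in the abstract.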
