Tele-Aloha: A Low-budget and High-authenticity Telepresence System Using Sparse RGB Cameras
May 23, 2024
Authors: Hanzhang Tu, Ruizhi Shao, Xue Dong, Shunyuan Zheng, Hao Zhang, Lili Chen, Meili Wang, Wenyu Li, Siyan Ma, Shengping Zhang, Boyao Zhou, Yebin Liu
cs.AI
Abstract
In this paper, we present a low-budget and high-authenticity bidirectional
telepresence system, Tele-Aloha, targeting peer-to-peer communication
scenarios. Compared to previous systems, Tele-Aloha utilizes only four sparse
RGB cameras, one consumer-grade GPU, and one autostereoscopic screen to achieve
high-resolution (2048x2048), real-time (30 fps), low-latency (less than 150 ms),
and robust remote communication. As the core of Tele-Aloha, we propose an
efficient novel view synthesis algorithm for the upper body. First, we design a
cascaded disparity estimator to obtain a robust geometry cue. Additionally,
a neural rasterizer via Gaussian Splatting is introduced to project latent
features onto the target view and to decode them at a reduced resolution.
Further, given the high-quality captured data, we leverage a weighted blending
mechanism to refine the decoded image into the final resolution of 2K.
With a world-leading autostereoscopic display and low-latency iris
tracking, users are able to experience a strong three-dimensional sense even
without any wearable head-mounted display device. Altogether, our telepresence
system demonstrates the sense of co-presence in real-life experiments,
inspiring the next generation of communication.
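
As a rough illustration of why a disparity map can serve as a geometry cue, the sketch below converts disparity to depth using the standard rectified-stereo relation depth = focal length x baseline / disparity. It is not the paper's cascaded estimator. The function name `disparity_to_depth`, the parameter names, and the sample camera values are all hypothetical and chosen only for this example.

```python
from typing import List, Optional


def disparity_to_depth(
    disparity: List[List[float]],
    focal_px: float,
    baseline_m: float,
    min_disp: float = 1e-6,
) -> List[List[Optional[float]]]:
    """Convert a rectified-stereo disparity map (pixels) to depth (metres).

    Uses depth = focal_px * baseline_m / disparity. Pixels whose disparity
    is not above min_disp have no finite depth and are returned as None.
    """
    scale = focal_px * baseline_m
    return [
        [scale / d if d > min_disp else None for d in row]
        for row in disparity
    ]


# Hypothetical values: 1280 px focal length, 0.125 m camera baseline.
disp = [[64.0, 32.0], [0.0, 16.0]]
depth = disparity_to_depth(disp, focal_px=1280.0, baseline_m=0.125)
print(depth)
```

Larger disparities map to points closer to the cameras, so errors in the disparity estimate translate directly into errors in the recovered geometry.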