Tele-Aloha: A Low-budget and High-authenticity Telepresence System Using Sparse RGB Cameras
May 23, 2024
Authors: Hanzhang Tu, Ruizhi Shao, Xue Dong, Shunyuan Zheng, Hao Zhang, Lili Chen, Meili Wang, Wenyu Li, Siyan Ma, Shengping Zhang, Boyao Zhou, Yebin Liu
cs.AI
Abstract
In this paper, we present a low-budget and high-authenticity bidirectional
telepresence system, Tele-Aloha, targeting peer-to-peer communication
scenarios. Compared to previous systems, Tele-Aloha utilizes only four sparse
RGB cameras, one consumer-grade GPU, and one autostereoscopic screen to achieve
high-resolution (2048x2048), real-time (30 fps), low-latency (less than 150 ms)
and robust distant communication. As the core of Tele-Aloha, we propose an
efficient novel view synthesis algorithm for the upper body. First, we design a
cascaded disparity estimator for obtaining a robust geometry cue. Additionally,
a neural rasterizer via Gaussian Splatting is introduced to project latent
features onto the target view and to decode them at a reduced resolution.
Further, given the high-quality captured data, we leverage a weighted blending
mechanism to refine the decoded image into the final resolution of 2K.
With a world-leading autostereoscopic display and low-latency iris
tracking, users can experience a strong three-dimensional sense even
without any wearable head-mounted display device. Altogether, our telepresence
system demonstrates the sense of co-presence in real-life experiments,
inspiring the next generation of communication.
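
As a rough illustration of two ideas the abstract mentions, the sketch below shows how a disparity value yields a geometry cue and how colors from several source views can be combined by weighted blending. It is a minimal sketch under stated assumptions, not the Tele-Aloha implementation: it assumes a rectified stereo pair with the standard pinhole relation Z = f * B / d, and a plain normalized weighted average per pixel. The function names, the parameters, and the example weights are hypothetical; the abstract does not specify how the system computes its blending weights.

```python
from typing import List, Sequence


def disparity_to_depth(disparity_px: float, focal_px: float, baseline_m: float) -> float:
    """Depth in meters for a rectified stereo pair: Z = f * B / d.

    Assumption: rectified pinhole cameras; the paper's estimator is not shown here.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive to give a finite depth")
    return focal_px * baseline_m / disparity_px


def blend_pixel(colors: Sequence[Sequence[float]], weights: Sequence[float]) -> List[float]:
    """Normalized weighted average of per-view RGB colors for one target pixel.

    Assumption: weights are non-negative scalars, one per source view.
    """
    if len(colors) != len(weights):
        raise ValueError("need exactly one weight per source color")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return [
        sum(w * color[channel] for color, w in zip(colors, weights)) / total
        for channel in range(3)
    ]


# Hypothetical numbers: 1600 px focal length, 0.25 m baseline, 80 px disparity.
depth_m = disparity_to_depth(80.0, 1600.0, 0.25)

# Two source views see the same surface point; the first is trusted three times more.
blended = blend_pixel([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]], [3.0, 1.0])
```

With these illustrative inputs, a larger disparity gives a smaller depth, and the blended color sits closer to the view with the larger weight.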