LangSplatV2: High-dimensional 3D Language Gaussian Splatting with 450+ FPS

July 9, 2025
Authors: Wanhua Li, Yujie Zhao, Minghan Qin, Yang Liu, Yuanhao Cai, Chuang Gan, Hanspeter Pfister
cs.AI

Abstract

In this paper, we introduce LangSplatV2, which achieves high-dimensional feature splatting at 476.2 FPS and 3D open-vocabulary text querying at 384.6 FPS on high-resolution images, a 42× and a 47× speedup over LangSplat, respectively, along with improved query accuracy. LangSplat employs Gaussian Splatting to embed 2D CLIP language features into 3D, significantly enhancing speed and learning a precise 3D language field with SAM semantics. Such advancements in 3D language fields are crucial for applications that require language interaction within complex scenes. However, LangSplat does not yet achieve real-time inference performance (8.2 FPS), even with advanced A100 GPUs, which severely limits its broader application. In this paper, we first conduct a detailed time analysis of LangSplat and identify the heavyweight decoder as the primary speed bottleneck. Our solution, LangSplatV2, assumes that each Gaussian acts as a sparse code within a global dictionary, leading to the learning of a 3D sparse coefficient field that entirely eliminates the need for a heavyweight decoder. By leveraging this sparsity, we further propose an efficient sparse coefficient splatting method with CUDA optimization, rendering high-dimensional feature maps at high quality while incurring only the time cost of splatting an ultra-low-dimensional feature. Our experimental results demonstrate that LangSplatV2 not only achieves better or competitive query accuracy but is also significantly faster. Code and demos are available on our project page: https://langsplat-v2.github.io.
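
The sparse-code idea in the abstract can be illustrated with a minimal sketch. It is not the authors' CUDA implementation. It assumes that each Gaussian stores only K non-zero coefficients over a shared dictionary, and that per-pixel rendering is standard front-to-back alpha blending. Under those assumptions blending is linear, so blending the sparse coefficients and applying the dictionary once per pixel gives the same result as decoding every Gaussian to a high-dimensional feature and blending those. All sizes and names (NUM_GAUSSIANS, DICT_SIZE, FEATURE_DIM, TOP_K, random_sparse_code, decode, blend_weights) are hypothetical and chosen only for illustration.

```python
import random

# All sizes below are illustrative assumptions, not values from the paper.
NUM_GAUSSIANS = 6   # Gaussians overlapping one pixel, sorted front to back
DICT_SIZE = 8       # number of entries in the shared (global) dictionary
FEATURE_DIM = 16    # dimension of the language feature (stand-in for CLIP)
TOP_K = 2           # non-zero coefficients stored per Gaussian

rng = random.Random(0)

# Global dictionary shared by all Gaussians: DICT_SIZE vectors of FEATURE_DIM.
dictionary = [[rng.gauss(0, 1) for _ in range(FEATURE_DIM)]
              for _ in range(DICT_SIZE)]


def random_sparse_code():
    """Return a coefficient vector with only TOP_K non-zero entries summing to 1."""
    indices = rng.sample(range(DICT_SIZE), TOP_K)
    raw = [rng.random() + 1e-3 for _ in indices]
    total = sum(raw)
    code = [0.0] * DICT_SIZE
    for i, r in zip(indices, raw):
        code[i] = r / total
    return code


def decode(code):
    """Map coefficients to a feature: one matrix-vector product with the dictionary."""
    return [sum(code[l] * dictionary[l][d] for l in range(DICT_SIZE))
            for d in range(FEATURE_DIM)]


def blend_weights(opacities):
    """Front-to-back alpha-blending weights: alpha_i * prod_{j<i} (1 - alpha_j)."""
    weights, transmittance = [], 1.0
    for alpha in opacities:
        weights.append(alpha * transmittance)
        transmittance *= 1.0 - alpha
    return weights


codes = [random_sparse_code() for _ in range(NUM_GAUSSIANS)]
opacities = [rng.uniform(0.1, 0.9) for _ in range(NUM_GAUSSIANS)]
weights = blend_weights(opacities)

# Path A (naive): decode every Gaussian to FEATURE_DIM, then blend the features.
features = [decode(c) for c in codes]
pixel_a = [sum(w * f[d] for w, f in zip(weights, features))
           for d in range(FEATURE_DIM)]

# Path B (sparse): blend only the non-zero coefficients, then decode once.
blended_code = [0.0] * DICT_SIZE
for w, c in zip(weights, codes):
    for l, value in enumerate(c):
        if value != 0.0:
            blended_code[l] += w * value
pixel_b = decode(blended_code)

max_abs_diff = max(abs(a - b) for a, b in zip(pixel_a, pixel_b))
print(f"max |A - B| = {max_abs_diff:.2e}")
```

In Path B the per-Gaussian work scales with TOP_K rather than FEATURE_DIM, which is one way to read the abstract's claim that high-dimensional feature maps are rendered at the cost of splatting an ultra-low-dimensional feature. The two paths agree only up to floating-point rounding.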