ShapeSplat: A Large-scale Dataset of Gaussian Splats and Their Self-Supervised Pretraining
August 20, 2024
Authors: Qi Ma, Yue Li, Bin Ren, Nicu Sebe, Ender Konukoglu, Theo Gevers, Luc Van Gool, Danda Pani Paudel
cs.AI
Abstract
3D Gaussian Splatting (3DGS) has become the de facto method of 3D
representation in many vision tasks. This calls for 3D understanding
directly in this representation space. To facilitate research in this
direction, we first build a large-scale dataset of 3DGS using the commonly used
ShapeNet and ModelNet datasets. Our dataset ShapeSplat consists of 65K objects
from 87 unique categories, whose labels are in accordance with the respective
datasets. The creation of this dataset utilized the compute equivalent of 2 GPU
years on a TITAN XP GPU.
We utilize our dataset for unsupervised pretraining and supervised finetuning
for classification and segmentation tasks. To this end, we introduce
Gaussian-MAE, which highlights the unique benefits of
representation learning from Gaussian parameters. Through exhaustive
experiments, we provide several valuable insights. In particular, we show that
(1) the distribution of the optimized GS centroids significantly differs from
the uniformly sampled point cloud (used for initialization) counterpart; (2)
this change in distribution results in degradation in classification but
improvement in segmentation tasks when using only the centroids; (3) to
leverage additional Gaussian parameters, we propose Gaussian feature grouping
in a normalized feature space, along with a splats pooling layer, offering a
tailored solution to effectively group and embed similar Gaussians, which leads
to notable improvement in finetuning tasks.
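
The abstract does not spell out how Gaussian feature grouping works, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes each Gaussian is a row of parameters (for example centroid, scale, rotation, opacity, and color; this layout is hypothetical), min-max normalizes every column so that no parameter dominates the distance, picks group centers by farthest point sampling, and gathers each center's nearest neighbors in that normalized space. The function names and the choice of min-max scaling are assumptions made for illustration.

```python
import numpy as np

def normalize_features(feats):
    """Min-max scale each column to [0, 1] so no parameter dominates distances."""
    lo = feats.min(axis=0)
    span = feats.max(axis=0) - lo
    span[span == 0] = 1.0  # constant columns map to 0 instead of dividing by zero
    return (feats - lo) / span

def farthest_point_sampling(x, num_groups):
    """Greedily pick num_groups rows of x that are far apart (starts at row 0)."""
    chosen = [0]
    dist = np.linalg.norm(x - x[0], axis=1)
    for _ in range(num_groups - 1):
        nxt = int(dist.argmax())
        chosen.append(nxt)
        # Track each row's distance to its closest already-chosen center.
        dist = np.minimum(dist, np.linalg.norm(x - x[nxt], axis=1))
    return np.array(chosen)

def group_gaussians(feats, num_groups, group_size):
    """Return center indices (G,) and neighbor indices (G, group_size)."""
    x = normalize_features(feats)
    centers = farthest_point_sampling(x, num_groups)
    # Pairwise distances from every center to every Gaussian, shape (G, N).
    d = np.linalg.norm(x[centers][:, None, :] - x[None, :, :], axis=2)
    groups = np.argsort(d, axis=1)[:, :group_size]
    return centers, groups
```

Under these assumptions, a pooling step over each group (for instance a per-group maximum over embedded features) would then produce one token per group; whether the paper's splats pooling layer works this way is not stated in the abstract.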