ShapeSplat: A Large-scale Dataset of Gaussian Splats and Their Self-Supervised Pretraining
August 20, 2024
Authors: Qi Ma, Yue Li, Bin Ren, Nicu Sebe, Ender Konukoglu, Theo Gevers, Luc Van Gool, Danda Pani Paudel
cs.AI
Abstract
3D Gaussian Splatting (3DGS) has become the de facto method of 3D
representation in many vision tasks. This calls for 3D understanding
directly in this representation space. To facilitate research in this
direction, we first build a large-scale 3DGS dataset from the commonly used
ShapeNet and ModelNet datasets. Our dataset, ShapeSplat, consists of 65K objects
from 87 unique categories, with labels that follow the respective source
datasets. Creating the dataset used the compute equivalent of 2 GPU years on a
TITAN XP GPU.
We use our dataset for unsupervised pretraining and supervised finetuning
on classification and segmentation tasks. To this end, we introduce
Gaussian-MAE, which highlights the unique benefits of representation
learning from Gaussian parameters. Through exhaustive experiments, we
provide several valuable insights. In particular, we show that (1) the
distribution of the optimized GS centroids differs significantly from that of
the uniformly sampled point cloud used for initialization; (2) this change in
distribution degrades classification but improves segmentation when only the
centroids are used; (3) to leverage the additional Gaussian parameters, we
propose Gaussian feature grouping in a normalized feature space, along with a
splats pooling layer, offering a tailored solution to effectively group and
embed similar Gaussians, which leads to notable improvement in finetuning
tasks.
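
The grouping idea in point (3) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the normalization, the sampling strategy, or the pooling operator. The 14-dimensional parameter layout (centroid, scale, rotation quaternion, opacity, color), the min-max scaling, the farthest-point sampling of group centers, and the max pooling are all assumptions made for illustration, and every function name here is hypothetical.

```python
import numpy as np


def normalize_features(gaussians):
    """Min-max scale each parameter column to [0, 1] (assumed normalization)."""
    lo = gaussians.min(axis=0, keepdims=True)
    hi = gaussians.max(axis=0, keepdims=True)
    # Constant columns get a span of 1 so the division stays well defined.
    span = np.where(hi - lo > 0, hi - lo, 1.0)
    return (gaussians - lo) / span


def farthest_point_sampling(feats, num_groups):
    """Pick group centers that are spread out in the normalized feature space."""
    chosen = np.zeros(num_groups, dtype=np.int64)  # starts from index 0
    min_dist = np.full(feats.shape[0], np.inf)
    for i in range(1, num_groups):
        # Squared distance from every Gaussian to the most recent center.
        d = np.sum((feats - feats[chosen[i - 1]]) ** 2, axis=1)
        min_dist = np.minimum(min_dist, d)
        chosen[i] = int(np.argmax(min_dist))
    return chosen


def group_gaussians(gaussians, num_groups, group_size):
    """Return center indices and, per center, its nearest-neighbor indices."""
    feats = normalize_features(gaussians)
    centers = farthest_point_sampling(feats, num_groups)
    # Pairwise squared distances between centers and all Gaussians: (G, N).
    diff = feats[centers][:, None, :] - feats[None, :, :]
    d2 = np.sum(diff ** 2, axis=2)
    neighbors = np.argsort(d2, axis=1)[:, :group_size]
    return centers, neighbors


def max_pool_groups(gaussians, neighbors):
    """Stand-in pooling: per-group maximum over each raw parameter column."""
    return gaussians[neighbors].max(axis=1)


# Usage with random placeholder data (not real ShapeSplat Gaussians).
rng = np.random.default_rng(0)
splats = rng.normal(size=(256, 14))  # assumed layout: 3 + 3 + 4 + 1 + 3
centers, neighbors = group_gaussians(splats, num_groups=8, group_size=16)
pooled = max_pool_groups(splats, neighbors)
print(neighbors.shape, pooled.shape)
```

Measuring distances after per-column normalization keeps parameters with large numeric ranges, such as centroid coordinates, from dominating the neighborhood search over parameters with small ranges, such as opacity.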