
Sonata: Self-Supervised Learning of Reliable Point Representations

March 20, 2025
Authors: Xiaoyang Wu, Daniel DeTone, Duncan Frost, Tianwei Shen, Chris Xie, Nan Yang, Jakob Engel, Richard Newcombe, Hengshuang Zhao, Julian Straub
cs.AI

Abstract

In this paper, we question whether we have a reliable self-supervised point cloud model that can be used for diverse 3D tasks via simple linear probing, even with limited data and minimal computation. We find that existing 3D self-supervised learning approaches fall short when evaluated on representation quality through linear probing. We hypothesize that this is due to what we term the "geometric shortcut", which causes representations to collapse to low-level spatial features. This challenge is unique to 3D and arises from the sparse nature of point cloud data. We address it through two key strategies: obscuring spatial information and enhancing the reliance on input features, ultimately composing a Sonata of 140k point clouds through self-distillation. Sonata is simple and intuitive, yet its learned representations are strong and reliable: zero-shot visualizations demonstrate semantic grouping, alongside strong spatial reasoning through nearest-neighbor relationships. Sonata demonstrates exceptional parameter and data efficiency, tripling linear probing accuracy (from 21.8% to 72.5%) on ScanNet and nearly doubling performance with only 1% of the data compared to previous approaches. Full fine-tuning further advances SOTA across both 3D indoor and outdoor perception tasks.
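
For readers unfamiliar with the evaluation protocol named in the abstract: linear probing keeps the pretrained encoder frozen and trains only a linear classifier on its output features, so the resulting accuracy reflects the quality of the representation itself. The sketch below illustrates that idea in plain Python. It is not the authors' evaluation code: the synthetic 2-D features stand in for frozen encoder outputs, and the names `train_linear_probe` and `predict` are hypothetical.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_linear_probe(features, labels, num_classes, lr=0.1, epochs=100):
    """Fit a softmax linear classifier with full-batch gradient descent.

    Only the classifier weights are updated; the features are treated as
    fixed inputs, which is what makes this a "probe".
    """
    n, dim = len(features), len(features[0])
    weights = [[0.0] * dim for _ in range(num_classes)]
    biases = [0.0] * num_classes
    for _ in range(epochs):
        grad_w = [[0.0] * dim for _ in range(num_classes)]
        grad_b = [0.0] * num_classes
        for x, y in zip(features, labels):
            logits = [sum(wj * xj for wj, xj in zip(w, x)) + b
                      for w, b in zip(weights, biases)]
            probs = softmax(logits)
            for c in range(num_classes):
                # Gradient of cross-entropy with respect to logit c.
                err = probs[c] - (1.0 if c == y else 0.0)
                grad_b[c] += err / n
                for j in range(dim):
                    grad_w[c][j] += err * x[j] / n
        for c in range(num_classes):
            biases[c] -= lr * grad_b[c]
            for j in range(dim):
                weights[c][j] -= lr * grad_w[c][j]
    return weights, biases


def predict(x, weights, biases):
    """Return the index of the highest-scoring class for one feature vector."""
    scores = [sum(wj * xj for wj, xj in zip(w, x)) + b
              for w, b in zip(weights, biases)]
    return scores.index(max(scores))


# Synthetic stand-in for frozen per-point features (an assumption for this
# sketch; a real probe would use outputs of the pretrained, frozen encoder).
rng = random.Random(0)
centers = [(3.0, 0.0), (-3.0, 0.0), (0.0, 3.0)]
labels = [rng.randrange(3) for _ in range(300)]
features = [[centers[y][0] + rng.gauss(0, 0.5),
             centers[y][1] + rng.gauss(0, 0.5)] for y in labels]

weights, biases = train_linear_probe(features, labels, num_classes=3)
accuracy = sum(predict(x, weights, biases) == y
               for x, y in zip(features, labels)) / len(labels)
print(f"linear probe accuracy: {accuracy:.3f}")
```

Because the classifier has no capacity beyond a linear map, a high score here is only possible when the classes are already linearly separable in feature space, which is the property the abstract's linear probing numbers are meant to measure.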

