SAMPart3D: Segment Any Part in 3D Objects

Abstract

3D part segmentation is a crucial and challenging task in 3D perception, playing a vital role in applications such as robotics, 3D generation, and 3D editing. Recent methods harness the powerful Vision Language Models (VLMs) for 2D-to-3D knowledge distillation, achieving zero-shot 3D part segmentation. However, these methods are limited by their reliance on text prompts, which restricts the scalability to large-scale unlabeled datasets and the flexibility in handling part ambiguities. In this work, we introduce SAMPart3D, a scalable zero-shot 3D part segmentation framework that segments any 3D object into semantic parts at multiple granularities, without requiring predefined part label sets as text prompts. For scalability, we use text-agnostic vision foundation models to distill a 3D feature extraction backbone, allowing scaling to large unlabeled 3D datasets to learn rich 3D priors. For flexibility, we distill scale-conditioned part-aware 3D features for 3D part segmentation at multiple granularities. Once the segmented parts are obtained from the scale-conditioned part-aware 3D features, we use VLMs to assign semantic labels to each part based on the multi-view renderings. Compared to previous methods, our SAMPart3D can scale to the recent large-scale 3D object dataset Objaverse and handle complex, non-ordinary objects. Additionally, we contribute a new 3D part segmentation benchmark to address the lack of diversity and complexity of objects and parts in existing benchmarks. Experiments show that our SAMPart3D significantly outperforms existing zero-shot 3D part segmentation methods, and can facilitate various applications such as part-level editing and interactive segmentation.
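The idea of segmenting at multiple granularities from scale-conditioned features can be illustrated with a toy sketch. The abstract does not say how features are grouped into parts, so everything here is an assumption made for illustration: `toy_features` is a hypothetical stand-in for a learned scale-conditioned feature network, the `scale` convention (0 = finest, 1 = coarsest) is invented, and the threshold-based grouping is a simple substitute for whatever grouping step the authors actually use.

```python
import math

def toy_features(coarse_id, fine_id, scale):
    """Hypothetical stand-in for a scale-conditioned feature network.

    scale is assumed to lie in [0, 1]: 0 = finest granularity, 1 = coarsest.
    The fine-grained component of the feature shrinks as scale grows.
    """
    return (10.0 * coarse_id, 2.0 * (1.0 - scale) * fine_id)

def group_points(features, threshold=1.0):
    """Group points whose features are closer than `threshold` (single linkage)."""
    parent = list(range(len(features)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(features)):
        for j in range(i + 1, len(features)):
            if math.dist(features[i], features[j]) < threshold:
                parent[find(i)] = find(j)

    # Relabel group roots as consecutive part ids in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(len(features))]

def segment(points, scale):
    """Return one part id per point at the requested granularity."""
    features = [toy_features(c, f, scale) for c, f in points]
    return group_points(features)

# Toy "chair": two seat points, then points on four legs.
# Each point is (coarse part id, fine part id); both ids are made up.
points = [(0, 0), (0, 0), (1, 0), (1, 0), (1, 1), (1, 1), (1, 2), (1, 3)]

coarse_parts = segment(points, scale=1.0)  # seat vs. all legs
fine_parts = segment(points, scale=0.0)    # seat vs. each leg separately
```

In this toy setup the same points fall into two parts at the coarse scale and five parts at the fine scale, which mirrors the abstract's claim that a single set of scale-conditioned features can support segmentation at several granularities.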
