PointInfinity: Resolution-Invariant Point Diffusion Models

Abstract

We present PointInfinity, an efficient family of point cloud diffusion models. Our core idea is to use a transformer-based architecture with a fixed-size, resolution-invariant latent representation. This enables efficient training with low-resolution point clouds, while allowing high-resolution point clouds to be generated during inference. More importantly, we show that scaling the test-time resolution beyond the training resolution improves the fidelity of generated point clouds and surfaces. We analyze this phenomenon and draw a link to classifier-free guidance commonly used in diffusion models, demonstrating that both allow trading off fidelity and variability during inference. Experiments on CO3D show that PointInfinity can efficiently generate high-resolution point clouds (up to 131k points, 31 times more than Point-E) with state-of-the-art quality.
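The abstract compares test-time resolution scaling with classifier-free guidance. As a reminder of what the guidance side of that comparison looks like, here is a minimal sketch of the standard classifier-free guidance combination step. It is not taken from the PointInfinity code: the function name `cfg_combine`, the list-of-floats representation, and the convention that a scale of 1 recovers the conditional prediction are all assumptions made for illustration.

```python
from typing import List


def cfg_combine(
    eps_cond: List[float],
    eps_uncond: List[float],
    guidance_scale: float,
) -> List[float]:
    """Combine conditional and unconditional noise predictions.

    Hypothetical helper for illustration only. Convention assumed here:
    scale 0 -> unconditional prediction, scale 1 -> conditional prediction,
    scale > 1 -> extrapolation past the conditional prediction.
    """
    if len(eps_cond) != len(eps_uncond):
        raise ValueError("predictions must have the same length")
    # Move from the unconditional prediction toward (and past) the
    # conditional one, elementwise.
    return [u + guidance_scale * (c - u) for c, u in zip(eps_cond, eps_uncond)]


if __name__ == "__main__":
    cond = [1.0, 2.0]
    uncond = [0.0, 1.0]
    for scale in (0.0, 1.0, 2.0):
        print(scale, cfg_combine(cond, uncond, scale))
```

In this convention, raising `guidance_scale` above 1 pushes samples harder toward the condition, which is the usual way guidance trades variability for fidelity. The abstract's claim is that increasing the number of points at inference has a comparable effect; the sketch covers only the guidance side of that comparison.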
