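C-GenReg: Training-Free 3D Point Cloud Registration by Multi-View-Consistent Geometry-to-Image Generation with Probabilistic Modalities Fusion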

Abstract

We introduce C-GenReg, a training-free framework for 3D point cloud registration that leverages the complementary strengths of world-scale generative priors and registration-oriented Vision Foundation Models (VFMs). Current learning-based 3D point cloud registration methods struggle to generalize across sensing modalities, sampling differences, and environments. C-GenReg therefore augments the geometric point cloud registration branch by transferring the matching problem into an auxiliary image domain, where VFMs excel: a World Foundation Model synthesizes multi-view-consistent RGB representations from the input geometry. This generative transfer preserves spatial coherence across source and target views without any fine-tuning. From the generated views, a VFM pretrained for dense correspondence estimation extracts matches, and the resulting pixel correspondences are lifted back to 3D via the original depth maps. To further enhance robustness, we introduce a "Match-then-Fuse" probabilistic cold-fusion scheme that combines two independent correspondence posteriors: one from the generated-RGB branch and one from the raw geometric branch. This principled fusion preserves each modality's inductive bias and provides calibrated confidence without any additional learning. C-GenReg is zero-shot and plug-and-play: all modules are pretrained and operate without fine-tuning. Extensive experiments on indoor (3DMatch, ScanNet) and outdoor (Waymo) benchmarks demonstrate strong zero-shot performance and superior cross-domain generalization. For the first time, we demonstrate a generative registration framework that operates successfully on real outdoor LiDAR data, where no imagery is available.
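The two steps the abstract names most concretely, lifting pixel correspondences to 3D through the depth maps and fusing two correspondence posteriors, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a standard pinhole camera with intrinsics `(fx, fy, cx, cy)`, integer pixel coordinates, and depth maps indexed `[row][column]`. The abstract does not state the fusion rule, so the normalized product used in `fuse_posteriors` is only one plausible way to combine two posteriors that are treated as independent. All function names here are hypothetical.

```python
from typing import List, Sequence, Tuple

Point3D = Tuple[float, float, float]
Pixel = Tuple[int, int]


def backproject(u: int, v: int, depth: float,
                fx: float, fy: float, cx: float, cy: float) -> Point3D:
    """Lift pixel (u, v) with metric depth to a camera-frame 3D point (pinhole model)."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def lift_matches(matches: Sequence[Tuple[Pixel, Pixel]],
                 depth_src: Sequence[Sequence[float]],
                 depth_tgt: Sequence[Sequence[float]],
                 intrinsics: Tuple[float, float, float, float]
                 ) -> List[Tuple[Point3D, Point3D]]:
    """Convert pixel matches into 3D-3D pairs, skipping pixels without valid depth."""
    fx, fy, cx, cy = intrinsics
    pairs = []
    for (us, vs), (ut, vt) in matches:
        ds = depth_src[vs][us]  # depth maps are indexed [row][column]
        dt = depth_tgt[vt][ut]
        if ds <= 0 or dt <= 0:
            continue  # assumption: non-positive depth marks a missing measurement
        pairs.append((backproject(us, vs, ds, fx, fy, cx, cy),
                      backproject(ut, vt, dt, fx, fy, cx, cy)))
    return pairs


def fuse_posteriors(p_rgb: Sequence[float], p_geo: Sequence[float]) -> List[float]:
    """Assumed fusion rule: normalized product of two posteriors over the same candidates."""
    if len(p_rgb) != len(p_geo) or len(p_rgb) == 0:
        raise ValueError("posteriors must be non-empty and of equal length")
    prod = [a * b for a, b in zip(p_rgb, p_geo)]
    total = sum(prod)
    if total <= 0.0:
        # The two branches share no support; fall back to a uniform distribution.
        return [1.0 / len(prod)] * len(prod)
    return [p / total for p in prod]
```

In this sketch, a candidate correspondence keeps a high fused probability only when both branches support it, which is one way agreement between modalities could translate into confidence. The 3D pairs returned by `lift_matches` would then feed a standard rigid-transform estimator, which is omitted here.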
