Free Geometry: Refining 3D Reconstruction from Longer Versions of Itself
April 15, 2026
Authors: Yuhang Dai, Xingyi Yang
cs.AI
Abstract
Feed-forward 3D reconstruction models are efficient but rigid: once trained, they perform inference in a zero-shot manner and cannot adapt to the test scene. As a result, visually plausible reconstructions often contain errors, particularly under occlusions, specularities, and ambiguous cues. To address this, we introduce Free Geometry, a framework that enables feed-forward 3D reconstruction models to self-evolve at test time without any 3D ground truth. Our key insight is that, when the model receives more views, it produces more reliable and view-consistent reconstructions. Leveraging this property, we mask a subset of frames in a test sequence to construct a self-supervised task. Free Geometry enforces cross-view feature consistency between representations from full and partial observations, while maintaining the pairwise relations implied by the held-out frames. This self-supervision allows for fast recalibration via lightweight LoRA updates, taking less than 2 minutes per dataset on a single GPU. Our approach consistently improves state-of-the-art foundation models, including Depth Anything 3 and VGGT, across 4 benchmark datasets, yielding an average improvement of 3.73% in camera pose accuracy and 2.88% in point map prediction. Code is available at https://github.com/hiteacherIamhumble/Free-Geometry.
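The frame-masking self-supervision described above can be illustrated with a minimal sketch. This is not the authors' implementation: the helpers `mask_frames`, `toy_model`, and `consistency_loss` are hypothetical stand-ins, with a toy per-frame "feature extractor" whose output depends on all input frames, mimicking how a feed-forward reconstruction model conditions each frame's prediction on the full view set.

```python
import numpy as np

def mask_frames(frames, mask_ratio=0.5, seed=0):
    # Hypothetical helper: randomly hold out a fraction of the frames,
    # returning the kept frames and their original indices.
    rng = np.random.default_rng(seed)
    n = len(frames)
    keep = np.sort(rng.choice(n, size=max(1, int(n * (1 - mask_ratio))),
                              replace=False))
    return [frames[i] for i in keep], keep

def toy_model(frames):
    # Toy stand-in for the reconstruction backbone: each frame's feature
    # mixes in the mean over all input frames, so predictions for the same
    # frame differ between full and partial observation.
    ctx = np.mean(frames, axis=0)
    return [f + ctx for f in frames]

def consistency_loss(full_feats, partial_feats, kept_idx):
    # Mean squared distance between features the model produced for the
    # kept frames under full vs. partial observation (the self-supervised
    # signal that would drive the LoRA update).
    diffs = [full_feats[i] - p for i, p in zip(kept_idx, partial_feats)]
    return float(np.mean([np.mean(d ** 2) for d in diffs]))

frames = [np.full(4, float(i)) for i in range(6)]   # 6 dummy "frames"
partial, kept = mask_frames(frames)
loss = consistency_loss(toy_model(frames), toy_model(partial), kept)
print(loss >= 0.0)
```

In the actual method this scalar would be backpropagated through LoRA adapters only, so the base model weights stay frozen and the recalibration remains lightweight.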