Test3R: Learning to Reconstruct 3D at Test Time
June 16, 2025
Authors: Yuheng Yuan, Qiuhong Shen, Shizun Wang, Xingyi Yang, Xinchao Wang
cs.AI
Abstract
Dense matching methods like DUSt3R regress pairwise pointmaps for 3D
reconstruction. However, the reliance on pairwise prediction and the limited
generalization capability inherently restrict global geometric consistency.
In this work, we introduce Test3R, a surprisingly simple test-time learning
technique that significantly boosts geometric accuracy. Given an image triplet
(I_1, I_2, I_3), Test3R generates reconstructions from the pairs (I_1, I_2) and
(I_1, I_3). The core idea is to optimize the network at test time via a
self-supervised objective: maximizing the geometric consistency between these
two reconstructions with respect to the common image I_1. This ensures the
model produces cross-pair consistent outputs regardless of the input pairs.
Extensive experiments demonstrate that our technique significantly outperforms
previous state-of-the-art methods on 3D reconstruction and multi-view depth
estimation tasks. Moreover, it is universally applicable and nearly cost-free:
it is easily applied to other models and implemented with minimal test-time
training overhead and parameter footprint. Code is available at
https://github.com/nopQAQ/Test3R.
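
To make the test-time objective concrete, below is a minimal PyTorch-style sketch of the adaptation loop. The interface `model(img_a, img_b)` returning each image's pointmap in the first image's coordinate frame, the trainable set `tunable_params` (e.g., prompt or adapter weights), and the L1 consistency loss are all illustrative assumptions, not the paper's exact implementation; see the repository above for the authors' code.

```python
# Minimal sketch of Test3R-style test-time learning (assumed interfaces).
import torch

def test_time_adapt(model, tunable_params, i1, i2, i3, steps=10, lr=1e-4):
    """Optimize the model at test time so that the two reconstructions
    sharing image I_1 agree on I_1's geometry (self-supervised)."""
    opt = torch.optim.Adam(tunable_params, lr=lr)
    for _ in range(steps):
        # Two pairwise reconstructions that share the reference image I_1.
        # Assumed: model returns (pointmap of first image, pointmap of second).
        pts1_from_pair12, _ = model(i1, i2)
        pts1_from_pair13, _ = model(i1, i3)
        # Consistency objective: the two estimates of I_1's pointmap should
        # coincide; an L1 distance stands in for the paper's loss here.
        loss = (pts1_from_pair12 - pts1_from_pair13).abs().mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model
```

Because only a small parameter set is updated for a handful of steps per scene, a loop of this shape stays consistent with the abstract's claim of minimal test-time training overhead and parameter footprint.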