RoMa v2: Harder Better Faster Denser Feature Matching

November 19, 2025
Authors: Johan Edstedt, David Nordström, Yushan Zhang, Georg Bökman, Jonathan Astermark, Viktor Larsson, Anders Heyden, Fredrik Kahl, Mårten Wadenbäck, Michael Felsberg
cs.AI

Abstract

Dense feature matching aims to estimate all correspondences between two images of a 3D scene and has recently been established as the gold standard due to its high accuracy and robustness. However, existing dense matchers still fail or perform poorly in many hard real-world scenarios, and high-precision models are often slow, limiting their applicability. In this paper, we attack these weaknesses on a wide front through a series of systematic improvements that together yield a significantly better model. In particular, we construct a novel matching architecture and loss, which, combined with a curated, diverse training distribution, enables our model to solve many complex matching tasks. We further make training faster through a decoupled two-stage matching-then-refinement pipeline and, at the same time, significantly reduce refinement memory usage through a custom CUDA kernel. Finally, we leverage the recent DINOv3 foundation model along with multiple other insights to make the model more robust and unbiased. In our extensive set of experiments, we show that the resulting novel matcher sets a new state of the art, being significantly more accurate than its predecessors. Code is available at https://github.com/Parskatt/romav2
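To make "all correspondences between two images" concrete, the sketch below shows one common way a dense matcher's output can be represented and consumed: a per-pixel warp from image A into image B plus a per-pixel confidence. This is an illustration only, not the RoMa v2 API. The names `normalized_to_pixel` and `confident_matches`, the [-1, 1] coordinate convention, and the 0.5 threshold are all assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical representation (an assumption, not taken from the paper):
# warp[y][x] = (u, v) is the location in image B matched to pixel (x, y)
# of image A, in normalized coordinates where -1 and +1 are the image edges.
Warp = List[List[Tuple[float, float]]]
Match = Tuple[Tuple[float, float], Tuple[float, float]]


def normalized_to_pixel(u: float, v: float, width: int, height: int) -> Tuple[float, float]:
    """Map normalized coordinates in [-1, 1] to continuous pixel coordinates."""
    return ((u + 1.0) * 0.5 * width, (v + 1.0) * 0.5 * height)


def confident_matches(
    warp: Warp,
    confidence: List[List[float]],
    size_b: Tuple[int, int],
    threshold: float = 0.5,
) -> List[Match]:
    """Keep only the correspondences whose confidence reaches the threshold."""
    width_b, height_b = size_b
    matches: List[Match] = []
    for y, row in enumerate(warp):
        for x, (u, v) in enumerate(row):
            if confidence[y][x] >= threshold:
                # Pixel centers of image A sit at half-integer coordinates.
                point_a = (x + 0.5, y + 0.5)
                point_b = normalized_to_pixel(u, v, width_b, height_b)
                matches.append((point_a, point_b))
    return matches


# Toy 2x2 warp into a 4x4 image B; two of the four pixels are confident.
toy_warp: Warp = [
    [(-0.5, -0.5), (0.5, -0.5)],
    [(-0.5, 0.5), (0.5, 0.5)],
]
toy_confidence = [[0.9, 0.1], [0.2, 0.8]]
toy_matches = confident_matches(toy_warp, toy_confidence, size_b=(4, 4))
```

Because every pixel carries a candidate match, downstream steps such as pose estimation can sample as many or as few correspondences as they need, which is the practical difference from sparse keypoint matching.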