
UFM: A Simple Path towards Unified Dense Correspondence with Flow

June 10, 2025
作者: Yuchen Zhang, Nikhil Keetha, Chenwei Lyu, Bhuvan Jhamb, Yutian Chen, Yuheng Qiu, Jay Karhade, Shreyas Jha, Yaoyu Hu, Deva Ramanan, Sebastian Scherer, Wenshan Wang
cs.AI

Abstract

Dense image correspondence is central to many applications, such as visual odometry, 3D reconstruction, object association, and re-identification. Historically, dense correspondence has been tackled separately for wide-baseline scenarios and optical flow estimation, despite the common goal of matching content between two images. In this paper, we develop a Unified Flow & Matching model (UFM), which is trained on unified data for pixels that are co-visible in both source and target images. UFM uses a simple, generic transformer architecture that directly regresses the (u, v) flow. Compared to the coarse-to-fine cost volumes typical of prior work, it is easier to train and more accurate for large flows. UFM is 28% more accurate than the state-of-the-art flow method Unimatch, while also achieving 62% lower error and running 6.7x faster than the dense wide-baseline matcher RoMa. UFM is the first to demonstrate that unified training can outperform specialized approaches across both domains. This result enables fast, general-purpose correspondence and opens new directions for multi-modal, long-range, and real-time correspondence tasks.
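The abstract's central architectural claim is that a generic transformer can regress the (u, v) flow directly, rather than searching a coarse-to-fine cost volume. The sketch below is a minimal illustration of that idea only; it is not the authors' implementation, and every module name, dimension, and the per-pixel co-visibility head are assumptions made for the example.

```python
# Minimal PyTorch sketch (assumed, not the UFM authors' code): joint
# attention over source/target patch tokens, then a head that directly
# regresses per-pixel (u, v) flow plus a co-visibility logit.
import torch
import torch.nn as nn

class DirectFlowRegressor(nn.Module):
    def __init__(self, dim=256, patch=16, depth=6, heads=8):
        super().__init__()
        self.patch = patch
        # Shared patch embedding for both RGB images.
        self.embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, batch_first=True)
        # A plain transformer over the concatenated token sequences,
        # standing in for the "simple, generic" architecture in the abstract.
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        # Per-patch head: (u, v) flow and a co-visibility logit for every
        # pixel inside the patch (3 channels per pixel).
        self.head = nn.Linear(dim, 3 * patch * patch)

    def forward(self, src, tgt):
        B, _, H, W = src.shape
        hp, wp = H // self.patch, W // self.patch
        tokens = torch.cat([
            self.embed(src).flatten(2).transpose(1, 2),  # (B, N, dim)
            self.embed(tgt).flatten(2).transpose(1, 2),
        ], dim=1)
        feats = self.encoder(tokens)[:, : hp * wp]  # source-side tokens
        out = self.head(feats)                      # (B, N, 3*p*p)
        out = out.view(B, hp, wp, 3, self.patch, self.patch)
        out = out.permute(0, 3, 1, 4, 2, 5).reshape(B, 3, H, W)
        flow, covis = out[:, :2], out[:, 2:]  # direct (u, v) + mask logit
        return flow, covis

# Usage: two images in, a dense flow field and co-visibility map out.
flow, covis = DirectFlowRegressor()(
    torch.randn(1, 3, 224, 224), torch.randn(1, 3, 224, 224))
print(flow.shape, covis.shape)  # (1, 2, 224, 224), (1, 1, 224, 224)
```

The design point the sketch captures is that the network emits the flow field in one regression pass; there is no correlation volume or iterative coarse-to-fine refinement, which is what the abstract credits for easier training and better accuracy on large flows.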