PoseDiffusion: Solving Pose Estimation via Diffusion-aided Bundle Adjustment

Abstract

Camera pose estimation is a long-standing computer vision problem that, to date, often relies on classical methods such as handcrafted keypoint matching, RANSAC, and bundle adjustment. In this paper, we propose to formulate the Structure from Motion (SfM) problem inside a probabilistic diffusion framework, modelling the conditional distribution of camera poses given input images. This novel view of an old problem has several advantages. (i) The nature of the diffusion framework mirrors the iterative procedure of bundle adjustment. (ii) The formulation allows a seamless integration of geometric constraints from epipolar geometry. (iii) It excels in typically difficult scenarios such as sparse views with wide baselines. (iv) The method can predict intrinsics and extrinsics for an arbitrary number of images. We demonstrate that our method, PoseDiffusion, significantly improves over classic SfM pipelines and learned approaches on two real-world datasets. Finally, we observe that our method can generalize across datasets without further training. Project page: https://posediffusion.github.io/
