GenStereo: Towards Open-World Generation of Stereo Images and Unsupervised Matching
March 17, 2025
Authors: Feng Qiao, Zhexiao Xiong, Eric Xing, Nathan Jacobs
cs.AI
Abstract
Stereo images are fundamental to numerous applications, including extended
reality (XR) devices, autonomous driving, and robotics. Unfortunately,
acquiring high-quality stereo images remains challenging due to the precise
calibration requirements of dual-camera setups and the complexity of obtaining
accurate, dense disparity maps. Existing stereo image generation methods
typically focus on either visual quality for viewing or geometric accuracy for
matching, but not both. We introduce GenStereo, a diffusion-based approach, to
bridge this gap. The method includes two primary innovations: (1) conditioning
the diffusion process on a disparity-aware coordinate embedding and a warped
input image, allowing for more precise stereo alignment than previous methods,
and (2) an adaptive fusion mechanism that intelligently combines the
diffusion-generated image with a warped image, improving both realism and
disparity consistency. Through extensive training on 11 diverse stereo
datasets, GenStereo demonstrates strong generalization ability. GenStereo
achieves state-of-the-art performance in both stereo image generation and
unsupervised stereo matching tasks. Our framework eliminates the need for
complex hardware setups while enabling high-quality stereo image generation,
making it valuable for both real-world applications and unsupervised learning
scenarios. The project page is available at https://qjizhi.github.io/genstereo.
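
To make the two ingredients named in the abstract more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a rectified stereo pair, where a left-view pixel at column x appears at column x - d in the right view for disparity d. The function names `forward_warp` and `fuse` are hypothetical, and the fixed mask-based blend only stands in for the paper's adaptive fusion mechanism, whose actual form the abstract does not specify. The same warp could in principle be applied to a coordinate grid to obtain a disparity-aware coordinate map, which is not shown here.

```python
import numpy as np


def forward_warp(left, disparity):
    """Warp a left-view image (H, W, C) into the right view.

    Assumes a rectified pair: a pixel at column x moves to column x - d.
    Returns the warped image and a boolean mask of pixels that were filled.
    """
    h, w = disparity.shape
    warped = np.zeros_like(left)
    mask = np.zeros((h, w), dtype=bool)
    # Track the largest disparity written to each target pixel so that
    # nearer surfaces (larger disparity) win when several pixels collide.
    best = np.full((h, w), -np.inf)

    ys, xs = np.mgrid[0:h, 0:w]
    xt = np.rint(xs - disparity).astype(int)  # target column in the right view
    valid = (xt >= 0) & (xt < w)              # drop pixels that leave the frame

    for y, x, x_new in zip(ys[valid], xs[valid], xt[valid]):
        d = disparity[y, x]
        if d > best[y, x_new]:
            best[y, x_new] = d
            warped[y, x_new] = left[y, x]
            mask[y, x_new] = True
    return warped, mask


def fuse(generated, warped, mask):
    """Hypothetical fixed blend: keep warped pixels where the warp is valid,
    and fall back to the generated image in holes (occlusions, borders)."""
    weight = mask.astype(generated.dtype)[..., None]
    return weight * warped + (1.0 - weight) * generated


# Toy example: a 1x4 single-channel image shifted by a constant disparity of 1.
left = np.array([[[1.0], [2.0], [3.0], [4.0]]])
disparity = np.ones((1, 4))
generated = np.full((1, 4, 1), 9.0)  # placeholder for a diffusion output

warped, mask = forward_warp(left, disparity)
fused = fuse(generated, warped, mask)
```

In this toy case the rightmost column receives no warped pixel, so the blend falls back to the placeholder generated value there. A learned fusion weight would replace the binary mask with a soft, per-pixel value.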