
GenStereo: Towards Open-World Generation of Stereo Images and Unsupervised Matching

March 17, 2025
Authors: Feng Qiao, Zhexiao Xiong, Eric Xing, Nathan Jacobs
cs.AI

Abstract

Stereo images are fundamental to numerous applications, including extended reality (XR) devices, autonomous driving, and robotics. Unfortunately, acquiring high-quality stereo images remains challenging due to the precise calibration requirements of dual-camera setups and the complexity of obtaining accurate, dense disparity maps. Existing stereo image generation methods typically focus on either visual quality for viewing or geometric accuracy for matching, but not both. We introduce GenStereo, a diffusion-based approach, to bridge this gap. The method includes two primary innovations: (1) conditioning the diffusion process on a disparity-aware coordinate embedding and a warped input image, allowing for more precise stereo alignment than previous methods, and (2) an adaptive fusion mechanism that intelligently combines the diffusion-generated image with a warped image, improving both realism and disparity consistency. Through extensive training on 11 diverse stereo datasets, GenStereo demonstrates strong generalization ability. GenStereo achieves state-of-the-art performance in both stereo image generation and unsupervised stereo matching tasks. Our framework eliminates the need for complex hardware setups while enabling high-quality stereo image generation, making it valuable for both real-world applications and unsupervised learning scenarios. The project page is available at https://qjizhi.github.io/genstereo.
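
The abstract mentions two ingredients that a small example may make more concrete: warping the input image with a disparity map, and fusing the warped image with a generated one. The sketch below is a generic illustration in plain Python, not the GenStereo implementation. It assumes a rectified pair, non-negative integer disparities, and grayscale images stored as nested lists. The function names `warp_left_to_right` and `fuse` are hypothetical, and the hard validity mask used for fusion is only a stand-in for whatever adaptive weighting the method actually uses.

```python
from typing import List, Tuple

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def warp_left_to_right(
    left: Image, disparity: List[List[int]]
) -> Tuple[Image, List[List[bool]]]:
    """Forward-warp a left view into the right view of a rectified pair.

    A left pixel at column x with disparity d lands at column x - d.
    If several pixels land on the same target, the one with the larger
    disparity (nearer to the camera) wins. Targets that receive no pixel
    stay at 0.0 and are marked invalid.
    """
    height, width = len(left), len(left[0])
    warped = [[0.0] * width for _ in range(height)]
    valid = [[False] * width for _ in range(height)]
    best_disp = [[-1] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            d = disparity[y][x]
            tx = x - d  # target column in the right view
            if 0 <= tx < width and d > best_disp[y][tx]:
                warped[y][tx] = left[y][x]
                valid[y][tx] = True
                best_disp[y][tx] = d
    return warped, valid


def fuse(warped: Image, valid: List[List[bool]], generated: Image) -> Image:
    """Keep warped pixels where they are valid, else use generated pixels.

    Assumption: a hard binary mask; a learned method would likely use
    soft, per-pixel weights instead.
    """
    return [
        [w if v else g for w, v, g in zip(w_row, v_row, g_row)]
        for w_row, v_row, g_row in zip(warped, valid, generated)
    ]


if __name__ == "__main__":
    left = [[10.0, 20.0, 30.0, 40.0]]
    disparity = [[1, 1, 2, 2]]
    generated = [[1.0, 2.0, 3.0, 4.0]]  # placeholder for a model output
    warped, valid = warp_left_to_right(left, disparity)
    print(warped)  # [[30.0, 40.0, 0.0, 0.0]]
    print(fuse(warped, valid, generated))  # [[30.0, 40.0, 3.0, 4.0]]
```

In this toy row, the two rightmost target pixels receive nothing from the warp (they correspond to regions not visible in the left view), so the fused result takes them from the generated image. Filling such holes plausibly is the kind of gap a generative model is meant to close.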
