GenStereo: Towards Open-World Generation of Stereo Images and Unsupervised Matching
March 17, 2025
Authors: Feng Qiao, Zhexiao Xiong, Eric Xing, Nathan Jacobs
cs.AI
Abstract
Stereo images are fundamental to numerous applications, including extended
reality (XR) devices, autonomous driving, and robotics. Unfortunately,
acquiring high-quality stereo images remains challenging due to the precise
calibration requirements of dual-camera setups and the complexity of obtaining
accurate, dense disparity maps. Existing stereo image generation methods
typically focus on either visual quality for viewing or geometric accuracy for
matching, but not both. We introduce GenStereo, a diffusion-based approach, to
bridge this gap. The method includes two primary innovations: (1) conditioning
the diffusion process on a disparity-aware coordinate embedding and a warped
input image, allowing for more precise stereo alignment than previous methods,
and (2) an adaptive fusion mechanism that intelligently combines the
diffusion-generated image with a warped image, improving both realism and
disparity consistency. Through extensive training on 11 diverse stereo
datasets, GenStereo demonstrates strong generalization ability. GenStereo
achieves state-of-the-art performance in both stereo image generation and
unsupervised stereo matching tasks. Our framework eliminates the need for
complex hardware setups while enabling high-quality stereo image generation,
making it valuable for both real-world applications and unsupervised learning
scenarios. The project page is available at https://qjizhi.github.io/genstereo.
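
To make the two ingredients named in the abstract more concrete, the following is a minimal sketch of disparity-based warping and mask-based fusion on a rectified pair. It is not the authors' implementation: the function names `forward_warp_left_to_right` and `fuse` are hypothetical, the nearest-pixel forward warp is a simplification, and the fixed validity-mask blend only stands in for the adaptive fusion mechanism, whose details the abstract does not give. The `generated` array is a placeholder for the output of a diffusion model.

```python
import numpy as np


def forward_warp_left_to_right(left, disparity):
    """Forward-warp a left image into the right view (rectified pair).

    Assumption: a pixel at column x with disparity d >= 0 lands at column
    x - d in the right view. When several source pixels hit the same target,
    the one with the largest disparity (nearest surface) wins.
    Returns the warped image and a boolean mask of filled target pixels.
    """
    h, w = disparity.shape
    warped = np.zeros_like(left)
    mask = np.zeros((h, w), dtype=bool)
    best = np.full((h, w), -np.inf)  # largest disparity written so far
    for y in range(h):
        for x in range(w):
            d = disparity[y, x]
            xt = int(round(x - d))  # target column, nearest pixel
            if 0 <= xt < w and d > best[y, xt]:
                best[y, xt] = d
                warped[y, xt] = left[y, x]
                mask[y, xt] = True
    return warped, mask


def fuse(warped, mask, generated):
    """Keep warped pixels where the warp is valid, generated pixels elsewhere.

    Assumption: a hard binary mask; a learned method would typically
    predict soft per-pixel weights instead.
    """
    weight = mask.astype(generated.dtype)[..., None]
    return weight * warped + (1.0 - weight) * generated


# Tiny usage example: a 1x4 single-channel image shifted by one pixel.
left = np.array([[[10.0], [20.0], [30.0], [40.0]]])
disparity = np.ones((1, 4))
generated = np.full_like(left, 99.0)  # placeholder for a generated image

warped, mask = forward_warp_left_to_right(left, disparity)
fused = fuse(warped, mask, generated)
```

In this toy case the rightmost column of the right view receives no warped pixel, so the fused image falls back to the generated value there; this is the kind of disoccluded region that a generative model is needed to fill.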