GenStereo: Towards Open-World Generation of Stereo Images and Unsupervised Matching

March 17, 2025
Authors: Feng Qiao, Zhexiao Xiong, Eric Xing, Nathan Jacobs
cs.AI

Abstract

Stereo images are fundamental to numerous applications, including extended reality (XR) devices, autonomous driving, and robotics. However, acquiring high-quality stereo images remains challenging because dual-camera setups require precise calibration and accurate, dense disparity maps are hard to obtain. Existing stereo image generation methods typically focus on either visual quality for viewing or geometric accuracy for matching, but not both. We introduce GenStereo, a diffusion-based approach, to bridge this gap. The method includes two primary innovations: (1) conditioning the diffusion process on a disparity-aware coordinate embedding and a warped input image, which allows more precise stereo alignment than previous methods, and (2) an adaptive fusion mechanism that combines the diffusion-generated image with a warped image, improving both realism and disparity consistency. Trained extensively on 11 diverse stereo datasets, GenStereo demonstrates strong generalization ability and achieves state-of-the-art performance in both stereo image generation and unsupervised stereo matching. Our framework eliminates the need for complex hardware setups while enabling high-quality stereo image generation, making it valuable for both real-world applications and unsupervised learning scenarios. The project page is available at https://qjizhi.github.io/genstereo
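
To make the two ingredients named in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It forward-warps a left image into the right view using a disparity map, then blends the warped pixels with a separately generated image. The function names `warp_left_to_right` and `fuse` are hypothetical, the "generated" image is a constant placeholder standing in for a diffusion output, and the fusion weight is simply the warp validity mask, whereas the abstract describes an adaptive mechanism whose details are not given here. The disparity-aware coordinate embedding is not sketched.

```python
import numpy as np

def warp_left_to_right(left, disparity):
    """Forward-warp a left image into the right view of a rectified pair.

    A pixel at column x with disparity d lands at column x - d.
    Returns the warped image and a boolean mask of pixels that were filled.
    """
    h, w = disparity.shape
    warped = np.zeros_like(left)
    valid = np.zeros((h, w), dtype=bool)
    best = np.full((h, w), -np.inf)  # largest disparity written per target pixel
    for y in range(h):
        for x in range(w):
            d = disparity[y, x]
            xr = int(round(x - d))
            # Larger disparity means a nearer surface, which occludes farther ones.
            if 0 <= xr < w and d > best[y, xr]:
                best[y, xr] = d
                warped[y, xr] = left[y, x]
                valid[y, xr] = True
    return warped, valid

def fuse(generated, warped, weight):
    """Per-pixel convex blend: weight 1 keeps the warped pixel, 0 the generated one."""
    wgt = weight[..., None] if generated.ndim == 3 else weight
    return wgt * warped + (1.0 - wgt) * generated

# Toy example: one row, every pixel shifted left by one column.
left = np.array([[10.0, 20.0, 30.0, 40.0]])
disparity = np.ones((1, 4))
warped, valid = warp_left_to_right(left, disparity)

# Placeholder for a generated right view (assumption: constant image).
generated = np.full_like(left, 99.0)

# Assumption: use the validity mask as the fusion weight, so holes left by
# the warp are filled from the generated image.
fused = fuse(generated, warped, valid.astype(float))
print(fused)  # [[20. 30. 40. 99.]]
```

In this toy case the rightmost column receives no warped pixel, so the blend falls back to the generated value there while keeping the geometrically exact warped pixels elsewhere.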

