
StereoSpace: Depth-Free Synthesis of Stereo Geometry via End-to-End Diffusion in a Canonical Space

December 11, 2025
作者: Tjark Behrens, Anton Obukhov, Bingxin Ke, Fabio Tosi, Matteo Poggi, Konrad Schindler
cs.AI

Abstract

We introduce StereoSpace, a diffusion-based framework for monocular-to-stereo synthesis that models geometry purely through viewpoint conditioning, without explicit depth or warping. A canonical rectified space and the conditioning guide the generator to infer correspondences and fill disocclusions end-to-end. To ensure fair and leakage-free evaluation, we introduce an end-to-end protocol that excludes any ground truth or proxy geometry estimates at test time. The protocol emphasizes metrics reflecting downstream relevance: iSQoE for perceptual comfort and MEt3R for geometric consistency. StereoSpace surpasses other methods from the warp & inpaint, latent-warping, and warped-conditioning categories, achieving sharp parallax and strong robustness on layered and non-Lambertian scenes. This establishes viewpoint-conditioned diffusion as a scalable, depth-free solution for stereo generation.
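To make the core idea concrete, here is a minimal, purely illustrative sketch of viewpoint-conditioned diffusion sampling. All names (`toy_denoiser`, `sample_right_view`) and the toy dynamics are hypothetical and are not the authors' implementation; a real system would use a trained diffusion network. The point it illustrates is structural: the sampler receives only the left image and a viewpoint condition (e.g., a target baseline in the canonical rectified space), and no depth map or explicit warp appears anywhere in the loop.

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_denoiser(x_t, t, left_image, viewpoint):
    """Stand-in for the learned network: nudges the noisy sample
    toward a horizontally shifted copy of the conditioning image,
    with the shift set by the viewpoint condition. A real model
    would be a trained diffusion U-Net/DiT that infers the
    correspondences and fills disocclusions itself."""
    target = np.roll(left_image, shift=int(viewpoint * 4), axis=1)
    return x_t + 0.5 * (target - x_t)

def sample_right_view(left_image, viewpoint, steps=10):
    """Iterative denoising from pure noise, conditioned only on the
    left image and the viewpoint -- no depth, no explicit warping."""
    x = rng.normal(size=left_image.shape)
    for t in range(steps, 0, -1):
        x = toy_denoiser(x, t, left_image, viewpoint)
    return x

left = rng.random((8, 8))
right = sample_right_view(left, viewpoint=1.0)
print(right.shape)  # (8, 8)
```

The design point is that geometry enters only through the conditioning signal, so the same sampler generalizes to layered and non-Lambertian content where an explicit depth-then-warp pipeline would break.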