αDepth: Learning Single-Pass Soft Boundary Decomposition for Stereo Conversion

Abstract

Accurately modeling soft boundaries, such as hair and defocus blur, is a fundamental challenge in stereo conversion because foreground and background blend ambiguously there. Existing depth models primarily predict a single depth layer, which leaves depth correspondence ambiguous at soft boundaries. Matting techniques can capture opacity for layered modeling, but they often struggle in complex scenes with multiple targets and usually require user intervention. This paper introduces αDepth, a layered representation that decomposes soft boundaries for high-fidelity stereo conversion. Specifically, we first resolve the ambiguity of mixed color and depth by estimating layered color and depth values at soft boundaries. To handle complex multi-target scenes, we design a Circular Alpha Representation (CAR) that shifts the paradigm from global target extraction to local boundary decomposition. Unlike prior matting methods, which are restricted to a single foreground/background pair, CAR enables efficient scene-level inference without manual guidance. Extensive evaluations demonstrate that αDepth achieves state-of-the-art performance in stereo conversion, eliminating background bleeding and structural distortions at soft boundaries.
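
To illustrate why a layered representation helps at soft boundaries, the following is a minimal sketch and not the αDepth implementation. It assumes the standard matting equation I = αF + (1 − α)B, a single 1-D scanline, and integer per-layer disparities; the function names `shift_row` and `render_layered` are hypothetical. Each layer is warped by its own disparity before compositing, so the semi-transparent edge moves with the foreground instead of carrying baked-in background color.

```python
import numpy as np

def shift_row(row, disparity, fill=0.0):
    """Shift a 1-D scanline by an integer disparity (positive = right), padding with `fill`."""
    out = np.full_like(row, fill)
    if disparity == 0:
        out[:] = row
    elif disparity > 0:
        out[disparity:] = row[:-disparity]
    else:
        out[:disparity] = row[-disparity:]
    return out

def render_layered(fg, alpha, bg, d_fg, d_bg):
    """Warp foreground (color and alpha) and background separately, then composite.

    Assumption: the compositing model is I = alpha * F + (1 - alpha) * B.
    Holes opened in the background by the shift are simply filled with 0 here.
    """
    fg_w = shift_row(fg, d_fg)
    alpha_w = shift_row(alpha, d_fg)
    bg_w = shift_row(bg, d_bg)
    return alpha_w * fg_w + (1.0 - alpha_w) * bg_w

# Toy scanline: a bright foreground object with soft (alpha = 0.5) edges
# over a uniform dark background.
fg = np.ones(8)
bg = np.full(8, 0.2)
alpha = np.array([0.0, 0.0, 0.5, 1.0, 1.0, 0.5, 0.0, 0.0])

source_view = alpha * fg + (1.0 - alpha) * bg              # the observed mixed colors
novel_view = render_layered(fg, alpha, bg, d_fg=2, d_bg=0)  # foreground shifted by 2 px
```

In this toy case the soft edge keeps its 50/50 blend after the shift because the blend is recomputed from the separated layers. Warping `source_view` with a single depth value per pixel would instead move the already-mixed colors, which is one way background bleeding can arise.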