StereoAdapter-2: Globally Structure-Consistent Underwater Stereo Depth Estimation

February 18, 2026
Authors: Zeyu Ren, Xiang Li, Yiran Wang, Zeyu Zhang, Hao Tang
cs.AI

Abstract

Stereo depth estimation is fundamental to underwater robotic perception, yet it suffers from severe domain shifts caused by wavelength-dependent light attenuation, scattering, and refraction. Recent approaches adapt monocular foundation models to underwater scenes using GRU-based iterative refinement; however, the sequential gating and local convolutional kernels in GRUs require multiple iterations for long-range disparity propagation, which limits performance in large-disparity and textureless underwater regions. In this paper, we propose StereoAdapter-2, which replaces the conventional ConvGRU updater with a novel ConvSS2D operator based on selective state space models. The proposed operator employs a four-directional scanning strategy that naturally aligns with epipolar geometry while capturing vertical structural consistency, enabling efficient long-range spatial propagation within a single update step at linear computational complexity. Furthermore, we construct UW-StereoDepth-80K, a large-scale synthetic underwater stereo dataset featuring diverse baselines, attenuation coefficients, and scattering parameters, using a two-stage generative pipeline that combines semantic-aware style transfer and geometry-consistent novel view synthesis. Combined with dynamic LoRA adaptation inherited from StereoAdapter, our framework achieves state-of-the-art zero-shot performance on underwater benchmarks, with a 17% improvement on TartanAir-UW and a 7.2% improvement on SQUID, and real-world validation on the BlueROV2 platform demonstrates the robustness of our approach. Code: https://github.com/AIGeeksGroup/StereoAdapter-2. Website: https://aigeeksgroup.github.io/StereoAdapter-2.
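
The idea of four-directional scanning with linear cost can be illustrated with a small NumPy sketch. This is not the ConvSS2D operator from the paper: it assumes a fixed scalar decay in place of the input-dependent (selective) state space parameters, and the names `directional_scan` and `four_way_scan` are hypothetical. It only shows how one pass of left-to-right, right-to-left, top-to-bottom, and bottom-to-top recurrences lets a single pixel influence its whole row (the epipolar direction in a rectified stereo pair) and its whole column, at a cost linear in the number of pixels.

```python
import numpy as np

def directional_scan(x, decay, axis, reverse):
    """Run h_t = decay * h_{t-1} + (1 - decay) * x_t along one axis of a 2D map."""
    # Move the scan axis to the front so each step updates a whole line at once.
    seq = np.moveaxis(x, axis, 0)
    if reverse:
        seq = seq[::-1]
    out = np.empty_like(seq, dtype=float)
    h = np.zeros(seq.shape[1:], dtype=float)
    for t in range(seq.shape[0]):
        h = decay * h + (1.0 - decay) * seq[t]
        out[t] = h
    if reverse:
        out = out[::-1]
    return np.moveaxis(out, 0, axis)

def four_way_scan(x, decay=0.9):
    """Average four scans: both directions along columns (axis 0) and rows (axis 1)."""
    scans = [
        directional_scan(x, decay, axis, reverse)
        for axis in (0, 1)
        for reverse in (False, True)
    ]
    return np.mean(scans, axis=0)

if __name__ == "__main__":
    # A single impulse stands in for one confident disparity estimate.
    feat = np.zeros((5, 7))
    feat[2, 3] = 1.0
    out = four_way_scan(feat)
    # After one pass, the whole row and column of the impulse are non-zero.
    print(np.count_nonzero(out[2]), np.count_nonzero(out[:, 3]))
```

In this toy setting a local 3x3 convolution would need several iterations to carry the impulse to the edge of the row, whereas the four scans reach it in a single pass, which is the contrast with GRU-based updates that the abstract draws.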