

UAGLNet: Uncertainty-Aggregated Global-Local Fusion Network with Cooperative CNN-Transformer for Building Extraction

December 15, 2025
Authors: Siyuan Yao, Dongxiu Liu, Taotao Li, Shengjie Li, Wenqi Ren, Xiaochun Cao
cs.AI

Abstract

Building extraction from remote sensing images is a challenging task due to the complex structural variations of buildings. Existing methods employ convolutional or self-attention blocks to capture multi-scale features in segmentation models, but the inherent gap between feature pyramid levels and the insufficient integration of global and local features lead to inaccurate, ambiguous extraction results. To address this issue, we present an Uncertainty-Aggregated Global-Local Fusion Network (UAGLNet), which is capable of exploiting high-quality global-local visual semantics under the guidance of uncertainty modeling. Specifically, we propose a novel cooperative encoder that adopts hybrid CNN and transformer layers at different stages to capture local and global visual semantics, respectively. An intermediate Cooperative Interaction Block (CIB) is designed to narrow the gap between the local and global features as the network becomes deeper. Afterwards, we propose a Global-Local Fusion (GLF) module to complementarily fuse the global and local representations. Moreover, to mitigate segmentation ambiguity in uncertain regions, we propose an Uncertainty-Aggregated Decoder (UAD) that explicitly estimates pixel-wise uncertainty to enhance segmentation accuracy. Extensive experiments demonstrate that our method outperforms other state-of-the-art methods. Our code is available at https://github.com/Dstate/UAGLNet
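
The abstract names its components without implementation detail, so the following is a minimal PyTorch sketch of two of the ideas, not the authors' code (the official implementation is at the GitHub link above). The module names (GlobalLocalFusion, UncertaintyAggregatedHead), the sigmoid gate used for fusion, and the entropy-based uncertainty map are all assumptions made for illustration.

```python
# Minimal sketch, assuming a gated fusion of CNN (local) and transformer
# (global) feature maps, and a pixel-wise uncertainty map modeled as the
# entropy of a coarse prediction. Not the authors' implementation.
import torch
import torch.nn as nn


class GlobalLocalFusion(nn.Module):
    """Complementarily fuse local (CNN) and global (transformer) features.

    Hypothetical form: a per-pixel gate decides how much of each branch
    to keep, so the two representations complement each other.
    """

    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, local_feat: torch.Tensor, global_feat: torch.Tensor) -> torch.Tensor:
        g = self.gate(torch.cat([local_feat, global_feat], dim=1))
        return g * local_feat + (1.0 - g) * global_feat


class UncertaintyAggregatedHead(nn.Module):
    """Decoder head that estimates pixel-wise uncertainty and uses it
    to emphasize ambiguous regions before the final prediction."""

    def __init__(self, channels: int, num_classes: int = 2):
        super().__init__()
        self.coarse = nn.Conv2d(channels, num_classes, kernel_size=1)
        self.refine = nn.Conv2d(channels, num_classes, kernel_size=3, padding=1)

    def forward(self, feat: torch.Tensor):
        coarse_logits = self.coarse(feat)
        prob = coarse_logits.softmax(dim=1)
        # Pixel-wise uncertainty as normalized entropy of the coarse prediction.
        entropy = -(prob * prob.clamp_min(1e-8).log()).sum(dim=1, keepdim=True)
        uncertainty = entropy / torch.log(torch.tensor(prob.shape[1], dtype=feat.dtype))
        # Re-weight features so uncertain pixels receive extra refinement.
        refined_logits = coarse_logits + self.refine(feat * (1.0 + uncertainty))
        return refined_logits, uncertainty


if __name__ == "__main__":
    local_feat = torch.randn(1, 64, 32, 32)   # CNN branch output
    global_feat = torch.randn(1, 64, 32, 32)  # transformer branch output
    fused = GlobalLocalFusion(64)(local_feat, global_feat)
    logits, unc = UncertaintyAggregatedHead(64)(fused)
    print(fused.shape, logits.shape, unc.shape)
```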