

Prithvi-Complementary Adaptive Fusion Encoder (CAFE): unlocking full potential for flood inundation mapping

January 5, 2026
作者: Saurabh Kaushik, Lalit Maurya, Beth Tellman
cs.AI

Abstract

Geo-Foundation Models (GFMs) have proven effective in diverse downstream applications, including semantic segmentation, classification, and regression tasks. However, for flood mapping on the Sen1Flood11 dataset as a downstream task, GFMs struggle to outperform a baseline U-Net, highlighting their limitation in capturing critical local nuances. To address this, we present the Prithvi-Complementary Adaptive Fusion Encoder (CAFE), which integrates the pretrained Prithvi GFM encoder with a parallel CNN residual branch enhanced by Convolutional Attention Modules (CAM). Prithvi-CAFE enables fast and efficient fine-tuning through adapters in Prithvi and performs multi-scale, multi-level fusion with CNN features, capturing critical local details while preserving long-range dependencies. We achieve state-of-the-art results on two comprehensive flood mapping datasets: Sen1Flood11 and FloodPlanet. On the Sen1Flood11 test data, Prithvi-CAFE (IoU 83.41) outperforms the original Prithvi (IoU 82.50) and other major GFMs (TerraMind 82.90, DOFA 81.54, SpectralGPT 81.02). The improvement is even more pronounced on the hold-out test site, where Prithvi-CAFE achieves an IoU of 81.37, compared to the baseline U-Net (70.57) and the original Prithvi (72.42). On FloodPlanet, Prithvi-CAFE also surpasses the baseline U-Net and other GFMs, achieving an IoU of 64.70 versus U-Net (60.14), TerraMind (62.33), DOFA (59.15), and Prithvi 2.0 (61.91). Our simple yet effective Prithvi-CAFE shows strong potential for segmentation tasks where multi-channel and multi-modal data provide complementary information and local details are critical. The code is released at https://github.com/Sk-2103/Prithvi-CAFE.
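The dual-branch fusion described above can be illustrated with a compact sketch. The PyTorch snippet below is a minimal, hypothetical rendering of the CAFE idea only: a frozen transformer encoder (a stand-in for the pretrained Prithvi backbone) runs alongside a parallel CNN residual branch with a convolutional attention module, a lightweight adapter projects the transformer tokens back to a spatial map, and the two streams are concatenated before a segmentation head. All module names, channel widths, band count, patch size, and the concatenation-based fusion rule are assumptions and not the authors' released implementation (see the linked GitHub repository for that).

```python
# Minimal sketch (not the authors' implementation) of the CAFE idea:
# frozen foundation-model tokens + parallel CNN residual branch with
# convolutional attention, fused before a segmentation head.
import torch
import torch.nn as nn


class ConvAttention(nn.Module):
    """Simple channel + spatial attention as an illustrative stand-in for CAM."""
    def __init__(self, ch):
        super().__init__()
        self.channel = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(ch, ch // 4, 1), nn.ReLU(),
            nn.Conv2d(ch // 4, ch, 1), nn.Sigmoid())
        self.spatial = nn.Sequential(nn.Conv2d(ch, 1, 7, padding=3), nn.Sigmoid())

    def forward(self, x):
        x = x * self.channel(x)          # reweight channels
        return x * self.spatial(x)       # reweight spatial locations


class ResidualCAMBlock(nn.Module):
    """CNN residual block followed by convolutional attention (local-detail branch)."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(),
            nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch))
        self.skip = nn.Conv2d(in_ch, out_ch, 1)
        self.cam = ConvAttention(out_ch)

    def forward(self, x):
        return self.cam(torch.relu(self.body(x) + self.skip(x)))


class CAFESketch(nn.Module):
    """Fuses frozen foundation-model features with a parallel CNN branch."""
    def __init__(self, vit_encoder, vit_dim=768, in_ch=6, n_classes=2, patch=16):
        super().__init__()
        self.vit = vit_encoder                       # pretrained encoder, kept frozen
        for p in self.vit.parameters():
            p.requires_grad = False
        self.patch = patch
        self.adapter = nn.Conv2d(vit_dim, 256, 1)    # lightweight trainable adapter
        self.cnn = ResidualCAMBlock(in_ch, 256)      # parallel local-detail branch
        self.head = nn.Sequential(
            nn.Conv2d(512, 128, 3, padding=1), nn.ReLU(),
            nn.Conv2d(128, n_classes, 1))

    def forward(self, x):
        b, _, h, w = x.shape
        tokens = self.vit(x)                                     # (B, N, vit_dim)
        gh, gw = h // self.patch, w // self.patch
        feat = tokens.transpose(1, 2).reshape(b, -1, gh, gw)     # tokens -> grid
        feat = nn.functional.interpolate(self.adapter(feat), size=(h, w),
                                         mode="bilinear", align_corners=False)
        fused = torch.cat([feat, self.cnn(x)], dim=1)            # simple concat fusion
        return self.head(fused)


if __name__ == "__main__":
    class DummyViT(nn.Module):
        """Stand-in for the pretrained Prithvi encoder; returns patch tokens."""
        def __init__(self, dim=768, patch=16, in_ch=6):
            super().__init__()
            self.proj = nn.Conv2d(in_ch, dim, patch, stride=patch)

        def forward(self, x):
            return self.proj(x).flatten(2).transpose(1, 2)       # (B, N, dim)

    model = CAFESketch(DummyViT())
    out = model(torch.randn(2, 6, 224, 224))
    print(out.shape)   # torch.Size([2, 2, 224, 224])
```

In this sketch only the adapter, CNN branch, and head are trainable, which mirrors the abstract's claim of fast, efficient fine-tuning; how the paper performs multi-scale, multi-level fusion across encoder depths is not reproduced here.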