Change State Space Models for Remote Sensing Change Detection

Abstract

Despite their frequent use for change detection, both ConvNets and Vision Transformers (ViTs) exhibit well-known limitations: the former struggle to model long-range dependencies, while the latter are computationally inefficient, which makes them challenging to train on large-scale datasets. Vision Mamba, an architecture based on State Space Models, has emerged as an alternative that addresses these deficiencies and has already been applied to remote sensing change detection, though mostly as a feature-extraction backbone. This article introduces the Change State Space Model, which is designed specifically for change detection: it focuses on the relevant changes between bi-temporal images and effectively filters out irrelevant information. By concentrating solely on the changed features, the model reduces the number of network parameters, significantly improving computational efficiency while maintaining high detection performance and robustness against input degradation. The proposed model has been evaluated on three benchmark datasets, where it outperformed ConvNets, ViTs, and Mamba-based counterparts at a fraction of their computational complexity. The implementation will be made available at https://github.com/Elman295/CSSM upon acceptance.
