MAESTRO: Masked AutoEncoders for Multimodal, Multitemporal, and Multispectral Earth Observation Data

Abstract

Self-supervised learning holds great promise for remote sensing, but standard self-supervised methods must be adapted to the unique characteristics of Earth observation data. We take a step in this direction by conducting a comprehensive benchmark of fusion strategies and reconstruction target normalization schemes for multimodal, multitemporal, and multispectral Earth observation data. Based on our findings, we propose MAESTRO, a novel adaptation of the Masked Autoencoder, featuring optimized fusion strategies and a tailored target normalization scheme that introduces a spectral prior as a self-supervisory signal. Evaluated on four Earth observation datasets, MAESTRO sets a new state of the art on tasks that strongly rely on multitemporal dynamics, while remaining highly competitive on tasks dominated by a single mono-temporal modality. Code to reproduce all our experiments is available at https://github.com/ignf/maestro.
