MAESTRO: Masked AutoEncoders for Multimodal, Multitemporal, and Multispectral Earth Observation Data

Abstract

Self-supervised learning holds great promise for remote sensing, but standard self-supervised methods must be adapted to the unique characteristics of Earth observation data. We take a step in this direction by conducting a comprehensive benchmark of fusion strategies and reconstruction target normalization schemes for multimodal, multitemporal, and multispectral Earth observation data. Based on our findings, we propose MAESTRO, a novel adaptation of the Masked Autoencoder, featuring optimized fusion strategies and a tailored target normalization scheme that introduces a spectral prior as a self-supervisory signal. Evaluated on four Earth observation datasets, MAESTRO sets a new state of the art on tasks that strongly rely on multitemporal dynamics, while remaining highly competitive on tasks dominated by a single mono-temporal modality. Code to reproduce all our experiments is available at https://github.com/ignf/maestro.

